Data Wrangling

Campaign finance

Ben Baumer

fec20: the elections

candidates: master table of ~5,000 candidates

-   e.g., `BIDEN, JOSEPH R JR / HARRIS, KAMALA D.`

committees: master table of ~17,000 committees

-   e.g., `TRUMP MAKE AMERICA GREAT AGAIN COMMITTEE`

Link to committee types

fec20: the money

pac: Summary of PAC activity ~12,000 rows

Sampled data: 1,000 records each

  • contributions
  • individuals
  • expenditures
  • transactions

contributions: Contributions from committees to candidates ~500,000 transactions

How is the data relational?

For unsampled data

  • Need to fetch full results with read_all_contributions()
  • Can be for or against candidate!

You should probably just ignore the other tables!!

File Description

Ex: Who “gave” to Biden?

library(tidyverse)  # provides %>%, filter(), str_detect(), etc.
library(fec20)
biden_id <- candidates %>%
  filter(
    cand_election_yr == 2020, 
    cand_pty_affiliation == "DEM", 
    str_detect(cand_name, "BIDEN")
  ) %>%
  pull(cand_id)
biden_id
[1] "P80000722"

Ex: Who “gave” to Biden? (cont’d)

contributions %>%
  filter(cand_id == biden_id) %>%
  group_by(cmte_id) %>%
  summarize(
    num_transactions = n(),
    total = sum(transaction_amt)
  ) %>%
  arrange(desc(total)) %>%
  left_join(committees, by = "cmte_id") %>%
  select(num_transactions, total, cmte_nm)
# A tibble: 8 × 3
  num_transactions total cmte_nm                          
             <int> <dbl> <chr>                            
1                3 28934 PRIORITIES USA ACTION            
2                1  4390 SIERRA CLUB INDEPENDENT ACTION   
3                2  3000 JSTREETPAC                       
4                1  2957 INDIVISIBLE ACTION               
5                1  1100 INDEPENDENTS FOR PROSPERITY, INC.
6                4   437 WORKING ARIZONA PAC              
7                1   408 WORKING AMERICA                  
8                1    50 JEWS FOR JOE 2020                

Huh?

contributions %>%
  filter(cand_id == biden_id) %>%
  group_by(cmte_id, transaction_tp) %>%    #<<
  summarize(.groups = "drop",
    num_transactions = n(),
    total = sum(transaction_amt)
  ) %>%
  arrange(desc(total)) %>%
  left_join(committees, by = "cmte_id") %>%
  select(transaction_tp, num_transactions, total, cmte_nm)
# A tibble: 8 × 4
  transaction_tp num_transactions total cmte_nm                          
  <chr>                     <int> <dbl> <chr>                            
1 24E                           3 28934 PRIORITIES USA ACTION            
2 24E                           1  4390 SIERRA CLUB INDEPENDENT ACTION   
3 24K                           2  3000 JSTREETPAC                       
4 24E                           1  2957 INDIVISIBLE ACTION               
5 24A                           1  1100 INDEPENDENTS FOR PROSPERITY, INC.
6 24E                           4   437 WORKING ARIZONA PAC              
7 24E                           1   408 WORKING AMERICA                  
8 24E                           1    50 JEWS FOR JOE 2020                

[transaction types](https://www.fec.gov/campaign-finance-data/transaction-type-code-descriptions/)

Corporate contributions

Is there a way to view the monetary values of contributions not made by individuals?

Missing data

  • Why are employer and occupation all empty in the contributions dataset?

  • The same happens with cand_id in the committees dataset.

  • Lots of data is missing from the dataset, and the labels are definitely more confusing than those in other datasets.

  • That’s what real data is like
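One way to see how much is missing is to compute the share of NA values in each column. The sketch below uses a small hypothetical table (`toy_contributions`, invented for illustration, not real FEC data); the same pipeline should apply to the real tables, assuming missing values are stored as NA.

```r
library(dplyr)

# Hypothetical toy rows; only the column names echo the real table
toy_contributions <- tibble(
  cmte_id    = c("C1", "C2", "C3"),
  employer   = c(NA, "ACME", NA),
  occupation = c(NA, NA, NA_character_)
)

# Share of missing values in every column
na_share <- toy_contributions %>%
  summarize(across(everything(), ~ mean(is.na(.x))))

na_share
```

A column whose share is 1 is entirely empty and can safely be dropped from an analysis.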

Committees?

It doesn’t say to whom the various individuals made their donations. Is there a way we can find that out?

  • individuals don’t give to candidates, they give to committees

  • committees spend on behalf of or against candidates

  • use cmte_id and cand_id to link tables

  • note that contributions table has both
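The linking described above can be sketched with two joins. The tables here are hypothetical stand-ins (`toy_candidates`, `toy_committees`, `toy_contributions`) that reuse only the key and name columns seen earlier; they are not real FEC data.

```r
library(dplyr)

# Hypothetical toy tables; only the key columns mirror fec20
toy_candidates <- tibble(
  cand_id   = c("P1", "P2"),
  cand_name = c("CANDIDATE A", "CANDIDATE B")
)
toy_committees <- tibble(
  cmte_id = c("C1", "C2"),
  cmte_nm = c("COMMITTEE ONE", "COMMITTEE TWO")
)
toy_contributions <- tibble(
  cmte_id         = c("C1", "C2", "C2"),
  cand_id         = c("P1", "P1", "P2"),
  transaction_amt = c(500, 250, 100)
)

# contributions carries both keys, so it can bridge the other two tables
linked <- toy_contributions %>%
  left_join(toy_committees, by = "cmte_id") %>%
  left_join(toy_candidates, by = "cand_id") %>%
  select(cmte_nm, cand_name, transaction_amt)

linked
```

Because the contributions table holds both cmte_id and cand_id, each row ends up labeled with a committee name and a candidate name.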

different types of committees

Negative amounts?

In the individuals and transactions data, why are some of the transaction amounts negative or zero?

  • donations can be returned

  • Pay attention to the transaction_tp codes:

    • 24A: Independent expenditure opposing election of candidate
    • 24E: Independent expenditure advocating election of candidate
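A minimal sketch of putting the two codes above to work: label each row as advocating or opposing, then compute net totals so that returned (negative) amounts offset the original spending. The table `toy_transactions` and the `stance` column are hypothetical, invented for illustration.

```r
library(dplyr)

# Hypothetical toy rows, including a refund (negative) and a zero amount
toy_transactions <- tibble(
  transaction_tp  = c("24E", "24E", "24A", "24E"),
  transaction_amt = c(1000, -200, 300, 0)
)

by_stance <- toy_transactions %>%
  mutate(
    stance = case_when(
      transaction_tp == "24A" ~ "opposing",    # independent expenditure against
      transaction_tp == "24E" ~ "advocating",  # independent expenditure for
      TRUE ~ "other"
    )
  ) %>%
  group_by(stance) %>%
  summarize(
    net_total       = sum(transaction_amt),       # refunds reduce the total
    num_nonpositive = sum(transaction_amt <= 0),  # rows worth a closer look
    .groups = "drop"
  )

by_stance
```

Summing without checking the code would mix money spent for a candidate with money spent against the same candidate.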