For each of the following pairs of variables, a statistically significant positive relationship has been observed. Identify a confounding variable that might cause the spurious correlation.
The amount of ice cream sold in New England and the number of deaths by drowning
The number of doctors in a region and the number of crimes committed in that region
Explore Categorical Data
We’ll be exploring categorical data using R.
Chapter 4
Exploring Categorical Data:
library(tidyverse)
library(openintro)
# You should read and view the documentation of the data as a first step to exploring it.
?assortive_mating
assortive_mating
# A tibble: 204 × 2
   self_male partner_female
   <fct>     <fct>
 1 blue      blue
 2 blue      blue
 3 blue      blue
 4 blue      blue
 5 blue      blue
 6 blue      blue
 7 blue      blue
 8 blue      blue
 9 blue      blue
10 blue      blue
# ℹ 194 more rows
Exploring categorical data
Let’s explore categorical data using summary statistics and visualizations.
Tables
Tables are helpful for understanding categorical data: they count how often each combination of categories occurs.
table(assortive_mating)
         partner_female
self_male blue brown green
    blue    78    23    13
    brown   19    23    12
    green   11     9    16
Prop tables
# Here I am saving the table as the variable my_table.
# I am telling table() I specifically want to look at the male and female variables.
my_table <- table(assortive_mating$self_male, assortive_mating$partner_female)
prop.table(my_table)
              blue      brown      green
  blue  0.38235294 0.11274510 0.06372549
  brown 0.09313725 0.11274510 0.05882353
  green 0.05392157 0.04411765 0.07843137
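The proportions above are joint proportions: each cell is divided by the grand total of 204 couples. The sketch below shows one way to get row proportions instead, using the `margin` argument of `prop.table()`. To stay self-contained, it rebuilds the counts from the table shown earlier with `matrix()` rather than loading `assortive_mating`. The names `eye_counts`, `joint_props`, and `row_props` are illustrative and do not come from the original notes.

```r
# Rebuild the 3 x 3 table of counts shown earlier
# (rows = self_male, columns = partner_female).
# eye_counts is an illustrative name, not part of the original notes.
eye_counts <- as.table(matrix(
  c(78, 23, 13,
    19, 23, 12,
    11,  9, 16),
  nrow = 3, byrow = TRUE,
  dimnames = list(self_male = c("blue", "brown", "green"),
                  partner_female = c("blue", "brown", "green"))
))

# Joint proportions: every cell divided by the grand total.
joint_props <- prop.table(eye_counts)

# Row proportions: every cell divided by its row total (margin = 1),
# i.e. the distribution of partner eye color within each male eye color.
row_props <- prop.table(eye_counts, margin = 1)

round(row_props, 3)
```

Each row of `row_props` sums to 1. Using `margin = 2` would divide by the column totals instead.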
Margins
addmargins(A = my_table)
        blue brown green Sum
  blue    78    23    13 114
  brown   19    23    12  54
  green   11     9    16  36
  Sum    108    55    41 204
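The Sum row and column above are the marginal totals for each variable. As a rough sketch, the same totals can be pulled out on their own with `margin.table()` and turned into marginal proportions. The counts are re-entered by hand from the table above so the example runs without the `openintro` package. The names `eye_counts`, `male_totals`, and `female_totals` are assumptions made for illustration.

```r
# Counts copied from the table above (rows = self_male, columns = partner_female).
# eye_counts is an illustrative name, not part of the original notes.
eye_counts <- as.table(matrix(
  c(78, 23, 13,
    19, 23, 12,
    11,  9, 16),
  nrow = 3, byrow = TRUE,
  dimnames = list(self_male = c("blue", "brown", "green"),
                  partner_female = c("blue", "brown", "green"))
))

# Row totals (male eye color) and column totals (female partner eye color).
# These match the Sum column and Sum row that addmargins() appends.
male_totals <- margin.table(eye_counts, margin = 1)
female_totals <- margin.table(eye_counts, margin = 2)

# Marginal proportions: each total divided by the grand total.
male_totals / sum(eye_counts)
female_totals / sum(eye_counts)

# addmargins() also works on a table of proportions.
addmargins(prop.table(eye_counts))
```

In the last call, the bottom-right cell of the result is the sum of all joint proportions, which is 1.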