CSI-MTH-190
You may not have palmerpenguins installed. You’ll know if you get the following message:
Error in `library()`:
! there is no package called ‘palmerpenguins’
If so, paste the install line into the console and press Return/Enter. (Remove the # first.)
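The install line itself is not reproduced in these notes. A minimal sketch of what it presumably looks like, assuming the package is installed from CRAN in the usual way:

```r
# One-time setup: remove the leading "# " and run this line in the console.
# install.packages("palmerpenguins")

# Once the package is installed, load it at the top of your document.
library(palmerpenguins)

# Quick check that the data set is available.
head(penguins)
```

You only need to install once per computer, but `library(palmerpenguins)` has to be run in every new R session.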
In this lecture I’ll use one data set, penguins from the palmerpenguins package, to illustrate some of the ideas you should consider when building a graph.
We’ll focus mostly on the numeric variables in this lecture, although many of the ideas are similar for both numeric and categorical variables.
Recall how we used summarize() in Lab 2 to calculate descriptive statistics.
# Putting mean() inside of summarize() is helpful
penguins |>
  summarize(
    body_mass = mean(body_mass_g, na.rm = TRUE)
  )
# A tibble: 1 × 1
  body_mass
      <dbl>
1     4202.
Below we can calculate the mean() with base R (i.e. without summarize()).
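The base R chunk did not survive in these notes. A minimal sketch, assuming the usual `$` column access:

```r
library(palmerpenguins)

# Pull the column out with $ and pass it straight to mean().
# na.rm = TRUE drops the penguins with a missing body mass.
mean_mass <- mean(penguins$body_mass_g, na.rm = TRUE)
mean_mass
```

This returns a single number rather than a tibble, which is handy when you want to reuse the value later, for example as the position of a reference line.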
Find the median flipper length using summarize().
To find the median flipper length using summarize() you would use this code in a chunk:
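The chunk is missing from these notes. One way to write it, following the same pattern as the mean example (the column name `median_flipper` is just an illustrative choice):

```r
library(palmerpenguins)
library(dplyr)

# Same pattern as the mean example: swap mean() for median()
# and body_mass_g for flipper_length_mm.
flipper_summary <- penguins |>
  summarize(
    median_flipper = median(flipper_length_mm, na.rm = TRUE)
  )
flipper_summary
```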
Boxplots are helpful for finding outliers.
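As a template, here is a sketch of a boxplot of body mass by species. The choice of variables is mine, not necessarily the one used in the original slide:

```r
library(palmerpenguins)
library(ggplot2)

# One box per species. Points drawn individually beyond the whiskers
# are the observations ggplot2 flags as potential outliers.
p_box <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  geom_boxplot()
p_box
```

ggplot2 will warn that it removed a couple of rows; those are the penguins with missing body mass.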
Try to make a boxplot of the penguins’ bill_depth_mm and color it by island.
If you want to mark part of the graph, sometimes a vertical line is helpful. You simply add on a geom_vline() layer.
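A sketch of the idea, assuming we want to mark the overall mean body mass on a horizontal boxplot (the variables, line color, and line type are illustrative choices):

```r
library(palmerpenguins)
library(ggplot2)

# Compute the value to mark once, outside the plot.
mean_mass <- mean(penguins$body_mass_g, na.rm = TRUE)

# Horizontal boxplots, with a dashed vertical line at the overall mean.
p_vline <- ggplot(penguins, aes(x = body_mass_g, y = species)) +
  geom_boxplot() +
  geom_vline(xintercept = mean_mass, color = "red", linetype = "dashed")
p_vline
```

`geom_vline()` needs an `xintercept`, so the variable you want to mark has to be on the x-axis.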
We can break a graph into smaller panels called facets, one for each level of a categorical variable.
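For instance, a sketch that splits the body mass boxplot into one panel per island using `facet_wrap()` (the choice of variables is illustrative):

```r
library(palmerpenguins)
library(ggplot2)

# facet_wrap(~ island) draws the same plot once per island.
p_facet <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  geom_boxplot() +
  facet_wrap(~ island)
p_facet
```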
Use a histogram to see the spread, center, and shape of a numeric variable.
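A minimal sketch of a histogram, using flipper length as an example variable (the choice of variable is an assumption):

```r
library(palmerpenguins)
library(ggplot2)

# A histogram needs only an x aesthetic; the heights are counts per bin.
p_hist <- ggplot(penguins, aes(x = flipper_length_mm)) +
  geom_histogram()
p_hist
```

With no arguments, `geom_histogram()` uses 30 bins and prints a message suggesting you choose a better value.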
We can adjust the number of bins or the bin width.
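Both options are arguments to `geom_histogram()`. The values 20 and 5 below are arbitrary examples, not recommendations:

```r
library(palmerpenguins)
library(ggplot2)

# Option 1: set how many bins to use.
p_bins <- ggplot(penguins, aes(x = flipper_length_mm)) +
  geom_histogram(bins = 20)

# Option 2: set how wide each bin is, in the units of x (here, mm).
p_width <- ggplot(penguins, aes(x = flipper_length_mm)) +
  geom_histogram(binwidth = 5)

p_bins
p_width
```

Use one or the other, not both. Fewer, wider bins smooth out the shape; more, narrower bins show more detail but also more noise.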
Copy some code above as a template and try to make a histogram of body_mass_g.
Add a vertical line at the mean() value, fill by sex and facet by island.
Here is my solution to the problem on the previous slide.
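The solution chunk is not preserved in these notes. The following is one possible solution reconstructed from the exercise wording, not necessarily the instructor's exact code (`bins = 30` is an arbitrary choice):

```r
library(palmerpenguins)
library(ggplot2)

# The value to mark with the vertical line.
mean_mass <- mean(penguins$body_mass_g, na.rm = TRUE)

p_solution <- ggplot(penguins, aes(x = body_mass_g, fill = sex)) +
  geom_histogram(bins = 30) +             # histogram of body mass, filled by sex
  geom_vline(xintercept = mean_mass) +    # vertical line at the overall mean
  facet_wrap(~ island)                    # one panel per island
p_solution
```

The line sits at the same x position in every panel because `mean_mass` is the mean over all penguins, not a per-island mean.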