This data, or a summary of it, is often made public.
We will not be collecting Data
That task is for a methods course (Psychology has such a course). When you collect data, you need to do it in a thoughtful way, which requires a fair amount of training.
We will be using Data
We want to use data to glean information.
Import Data
We first have to import data into whatever software we are using to analyze it.
We’ll import using three methods in this lecture, and we’ll see more as the semester progresses.
read_csv()
read_sheet()
packages
APIs
Scraping from websites
Importing a .csv file
A .csv file is a special type of text document. csv is short for “comma separated values.” Essentially, it’s a text file with commas separating the data within the file.
There are other ways of separating data (such as with tabs).
The tidyverse has a function called read_csv() (from its readr package) that takes a .csv file and reads it into memory so you can use it.
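As a minimal sketch of what this looks like, the code below writes a tiny made-up .csv file to a temporary location and reads it back with read_csv(). The file contents and the object names (csv_path, sales, tsv_path, sales_tab) are invented for illustration; with a real file you would pass its path instead. It loads readr directly, which library(tidyverse) would also load. The last lines use read_tsv(), readr's counterpart for tab-separated files.

```r
library(readr)  # read_csv() lives in readr; library(tidyverse) loads it too

# Made-up example data, written to a temporary .csv file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("city,year,sales",
             "Abilene,2000,72",
             "Austin,2000,1025"), csv_path)

# Read the file into memory and save it under a name
sales <- read_csv(csv_path)
sales

# The same made-up data separated by tabs instead of commas
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("city\tyear\tsales",
             "Abilene\t2000\t72",
             "Austin\t2000\t1025"), tsv_path)
sales_tab <- read_tsv(tsv_path)
```

Both calls give a table with the same three columns; only the separator in the file differs.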
Importing a Google Sheets file
Google Sheets are also a very common way of storing data. There is a package called googlesheets4 that has the function read_sheet() for reading in data from a Google Sheet.
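A hedged sketch of reading a sheet with googlesheets4 follows. It assumes the sheet is shared publicly, so gs4_deauth() is used to skip the Google login; a private sheet would need you to sign in instead. read_public_sheet() is a hypothetical helper name, not part of the package, and the example call (commented out because it needs an internet connection) uses the gapminder demo sheet that googlesheets4 exposes through gs4_example().

```r
library(googlesheets4)

# Skip the login prompt; this only works for sheets that are shared publicly
gs4_deauth()

# Hypothetical helper: read one worksheet of a public Google Sheet.
# `sheet_id` can be a sheet URL or ID; `worksheet` is the tab name
# (NULL means the first tab).
read_public_sheet <- function(sheet_id, worksheet = NULL) {
  read_sheet(sheet_id, sheet = worksheet)
}

# Example call (needs an internet connection):
# gapminder <- read_public_sheet(gs4_example("gapminder"))
```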
Importing from a package
Packages like the tidyverse already have data built in. Often these data are for educational purposes and are sometimes very out of date.
We’ll import from a package first, as it is the easiest.
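To see which data sets a package ships with, base R's data() function can list them. The short sketch below does this for ggplot2, one of the packages that library(tidyverse) loads; ggplot2_data is just a name chosen here for the result.

```r
library(ggplot2)  # one of the packages that library(tidyverse) loads

# data(package = ...) lists the data sets a package ships with;
# the "Item" column holds their names
ggplot2_data <- data(package = "ggplot2")$results[, "Item"]
ggplot2_data
```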
Example 1.
Let’s import txhousing with the tidyverse. This data set ships with ggplot2, so it is available automatically when we load the tidyverse.
# first load the tidyverse
library(tidyverse)
# next call the data with txhousing to view it.
txhousing
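Once the data is available, a few quick checks help confirm what we have. This is a small sketch rather than a required step: housing is a hypothetical name for our own copy, and ggplot2 is loaded directly here (loading the tidyverse works just as well).

```r
library(ggplot2)  # txhousing comes from ggplot2, which library(tidyverse) loads

# Save a copy under our own name (housing is a hypothetical name)
housing <- txhousing

dim(housing)     # number of rows and columns
names(housing)   # column names
head(housing)    # first six rows
```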
Next, we have to move it to our current working directory. In most cases this is the folder this .qmd file is in; if not, we can set it manually in RStudio, which I’ll do in the video of this lecture.
Finally, we have to load the tidyverse, read in the data, and save it to memory.
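The steps above can be sketched as follows. The file name my_data.csv is hypothetical, so swap in the name of your own file; the check with file.exists() is an addition here so the code gives a readable message instead of an error when the file is not in the working directory.

```r
library(readr)  # library(tidyverse) loads readr as well

# Where R is currently looking for files
getwd()

# Hypothetical file name; replace it with the name of your own file
csv_name <- "my_data.csv"

if (file.exists(csv_name)) {
  # Read the file and save it to memory under a name
  my_data <- read_csv(csv_name)
} else {
  message("Could not find ", csv_name, " in ", getwd())
}
```

If the message appears, either move the file into the folder printed by getwd() or change the working directory in RStudio.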