The top three languages in Data Science right now are:
Python (pandas, numpy, pytorch)
R (tidyverse)
SQL
Probably in that order.
Learning how to work with data is similar from one language to another.
R, I think, is the most convenient way to get students working with data.
For visualization: the ggplot2 ecosystem is nicer than plotting with pandas.
RStudio and R work very well together.
See this site for a longer history with fewer pictures.
“S is a language that was developed [in 1976] by John Chambers and others at the old Bell Telephone Laboratories”
TIBCO is the owner of S
Here’s a book on S
Ross Ihaka and Robert Gentleman of the University of Auckland developed R.
The versions of R are named after Peanuts comics or cartoons.
Why? 4.4.2 release
Other versions
Comprehensive R Archive Network
They manage all R updates and R packages. There are currently 19897 packages.
They put on the useR! conference
There’s an R magazine
Hadley Wickham and others made it.
Started in 2009 as RStudio
The tidyverse came out in 2016
Renamed Posit in 2022.
Offers support for Python, SQL, and other data science languages
PyTorch is an increasingly popular framework for deep learning in Python.
R has the torch package, an R interface to libtorch, the same C++ library that underlies PyTorch.
Posit has made a pivot toward integrating Python and Jupyter into their ecosystem with Quarto.
Quarto is a document publishing system that works with Python, R, and other languages.
Positron is a language-agnostic IDE forked from VS Code.
Photo of John Chambers taken from Stanford’s website.
Photo of Ross Ihaka - https://www.auckland.ac.nz/content/auckland/en/science/about-the-faculty/department-of-statistics/ihaka-lecture-series/jcr:content/leftpar/imagecomponent/image.img.1024.medium.jpg/1561079330278.jpg
Photo of Robert Gentleman - https://calendar.gatech.edu/sites/default/files/styles/large/public/hg_media/2024-25/Robert_Gentleman.jpg.webp?itok=0aDlOIOK
Photo by Hadley Wickham - Private correspondence, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=46810731
Most of the details of this short history came from chapter 2 of Roger Peng’s R Programming for Data Science.
https://bookdown.org/rdpeng/rprogdatascience/history-and-overview-of-r.html