Introduction
The main goal of this lab is to reinforce our understanding of R and RStudio, which we will use throughout the workshop both to learn the statistical concepts discussed in the course and to analyze real data and draw informed conclusions.
R is the name of the programming language itself, and RStudio is a convenient interface for working with it.
Before we get to that stage, however, you need to build some basic fluency in R. Today we begin with the fundamental building blocks of R and RStudio: the interface, reading in data, and basic commands.
Learning goals
By the end of the lab, you will…
- Be familiar with the workflow using R, RStudio, Git, and GitHub
- Gain practice writing a reproducible report using RMarkdown
- Practice version control using GitHub
- Be able to create data visualizations using ggplot2
- Be able to describe variable distributions and the relationship between multiple variables
Getting started
R and RStudio
Below are the components of the RStudio IDE.

Below are the components of a Quarto (.qmd) file.

Packages
We will use the following package in today’s lab.
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
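The output above is what R prints when the tidyverse is loaded. A minimal sketch of the call that produces it is shown here, assuming the package is already installed on your machine (the installation step in the comment is an assumption about your setup, not part of this lab's instructions):

```r
# Load the core tidyverse packages in one call.
# Assumes tidyverse is already installed; if it is not, it can be
# installed once with install.packages("tidyverse").
library(tidyverse)
```

The exact version numbers in the message depend on what is installed, so yours may differ from the ones shown.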
The tidyverse is a meta-package. When you load it, you get nine core packages loaded for you:
- ggplot2: for data visualization
- dplyr: for data wrangling
- tidyr: for data tidying and rectangling
- readr: for reading and writing data
- tibble: for modern, tidy data frames
- stringr: for string manipulation
- forcats: for dealing with factors
- purrr: for iteration with functional programming
- lubridate: for working with dates and times
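If you want to confirm which tidyverse packages are attached in your own session, one possible check is sketched below. It uses `tidyverse_packages()`, which lists every package in the tidyverse (including ones that are not attached by default), and compares that list against the packages currently attached. The variable name `attached_tidyverse` is just an illustrative choice.

```r
library(tidyverse)

# All packages that belong to the tidyverse, attached or not.
all_tidyverse <- tidyverse_packages(include_self = FALSE)

# Keep only those that are currently attached in this session.
attached_tidyverse <- intersect(all_tidyverse, .packages())
attached_tidyverse
```

The result should match the packages marked with a check in the startup message.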
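The message printed when you load the package tells you which versions of these packages are loaded, as well as any conflicts they introduce. For example, the filter() function from dplyr now masks the filter() function from the stats package that ships with base R. Masking does not overwrite anything: a plain call to filter() now resolves to dplyr's version, while the original remains available as stats::filter(). That's fine, since dplyr::filter() is the one we'll use anyway.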
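To see what masking means in practice, here is a small sketch using the built-in mtcars data set (chosen only for illustration; it is not part of this lab). The object names `small_cars` and `smoothed` are hypothetical.

```r
library(tidyverse)

# An unqualified filter() now refers to dplyr's version,
# which keeps the rows of a data frame that satisfy a condition.
small_cars <- filter(mtcars, cyl == 4)
nrow(small_cars)

# The stats version is still available through its namespace prefix.
# It applies a linear filter to a numeric series; here, a 3-point
# moving average of the numbers 1 to 5.
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
smoothed
```

Writing the prefix explicitly, as in dplyr::filter() or stats::filter(), is a simple way to make clear which function you intend when two packages share a name.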
We'll use functionality from all of these packages throughout the semester, and we'll always load them all at once with library(tidyverse). You can find out more about the tidyverse and each of the packages that make it up on the tidyverse website (https://www.tidyverse.org).