By the end of this unit, the student will be able to:
- […] (here, fs, styler, lintr).
- […] tidyverse as a coherent ecosystem for data analysis.

Code is not just for machines; it's for humans too. Well-written code is:
R does not enforce a style, but widely adopted conventions exist:
| Type | Recommended Style | Example | Usage |
|---|---|---|---|
| Objects, variables, functions | snake_case | customer_data, calculate_mean | Standard in tidyverse |
| S3/S4 Classes, constructors | PascalCase | DataFrame, LinearModel | Base and advanced packages |
| Constants | ALL_CAPS | PI = 3.1416 | Optional, rarely used in R |
| Temporary variables | . or short names | .x, tmp | Only in functions or pipes |
✅ Recommendation: Use snake_case for everything, unless you are developing a package with classes.
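To make the table concrete, here is a minimal sketch of the conventions in base R. All names (customer_data, calculate_mean, LinearModel, MAX_ROWS) are illustrative assumptions, not part of any package.

```r
# snake_case for objects and functions
customer_data <- data.frame(id = 1:3, spend = c(10, 25, 40))

calculate_mean <- function(values) {
  mean(values, na.rm = TRUE)
}

# PascalCase for a (hypothetical) S3 class constructor
LinearModel <- function(slope, intercept) {
  structure(
    list(slope = slope, intercept = intercept),
    class = "LinearModel"
  )
}

# ALL_CAPS for a constant (optional, rarely used in R)
MAX_ROWS <- 1000

mean_spend <- calculate_mean(customer_data$spend)
model <- LinearModel(slope = 2, intercept = 1)
```

The constructor is the only place where PascalCase appears, which matches the recommendation to keep snake_case everywhere else.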
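- Put spaces around operators: x <- 5 + 3, not x<-5+3.
- Put a space after commas: c(1, 2, 3), not c(1,2,3).
- Break long lines after %>%, or after + in ggplot2.
library(dplyr) # provides filter() and %>%
# ❌ Long line
result <- filter(mtcars, cyl == 4 & mpg > 25 & wt < 2.5)
# ✅ Better with pipe
result <- mtcars %>%
  filter(cyl == 4, mpg > 25, wt < 2.5)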
styler
Automatically formats your code according to the tidyverse style guide.
# Install
install.packages("styler")
# Use on a script
styler::style_file("my_script.R")
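# Use in RStudio: Addins > "Style active file" (installed with styler)
# Ctrl + Shift + A (Cmd + Shift + A on Mac) triggers RStudio's own
# "Reformat Code" command, which by default is not styler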
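For a quick look at what styler changes without touching a file, styler::style_text() formats a string of code. The sketch below assumes styler is installed; the guard skips the demo otherwise, and the variable names are illustrative.

```r
# Assumes styler is installed; skipped silently if it is not
if (requireNamespace("styler", quietly = TRUE)) {
  messy_code <- "x<-c(1,2,3)"
  # style_text() returns the formatted code, one element per line
  styled_code <- as.character(styler::style_text(messy_code))
  print(styled_code)
}
```

This is handy for checking how a snippet would be reformatted before running style_file() on a whole script.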
lintr
Checks for style errors and best practices in real time.
# Install
install.packages("lintr")
# Check a file
lintr::lint("my_script.R")
# RStudio integration: shown in the "Markers" panel
Packages are collections of functions, data, documentation, and compiled code that extend R’s capabilities. Base R includes ~30 packages; CRAN has 19,000+.
# From CRAN
install.packages("dplyr")
install.packages(c("ggplot2", "readr", "lubridate")) # multiple
# From Bioconductor
if (!require("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("DESeq2")
# Install remotes if you don't have it
install.packages("remotes")
# Install from GitHub
remotes::install_github("tidyverse/ggplot2")
remotes::install_github("rstudio/leaflet")
library(dplyr) # loads and attaches package to search path
require(ggplot2) # similar, but returns TRUE/FALSE (useful in functions)
# Load without attaching (avoids naming conflicts)
dplyr::filter(mtcars, cyl == 4)
⚠️ Caution: Some packages have functions with the same name (e.g., filter in dplyr and stats). Use :: to specify.
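The difference between checking for a package and calling it through its namespace can be sketched with packages that ship with R. has_package is a hypothetical helper name, and the fake package name is deliberately one that should not exist.

```r
# Hypothetical helper: TRUE if the package can be loaded.
# requireNamespace() loads the namespace but does not attach it.
has_package <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

has_stats <- has_package("stats")
has_fake <- has_package("notARealPackage12345")

# Calling through the namespace: masking by other packages
# cannot change which filter() runs here (a moving average).
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
```

Inside functions, a check of this kind lets the code fail with a clear message instead of a confusing "could not find function" error.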
# Update all packages
update.packages(ask = FALSE)
# Update a specific package
install.packages("dplyr", dependencies = TRUE)
# View outdated packages
old.packages()
# Remove a package
remove.packages("package_name")
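Before updating, it can help to know which version is installed. The sketch below uses packageVersion() from utils on the stats package, which ships with R; require_min_version is a hypothetical helper and the version numbers are arbitrary examples.

```r
# Version of an installed package (stats is always present)
stats_version <- packageVersion("stats")

# package_version objects can be compared with version strings
is_recent <- stats_version >= "3.5.0"

# Hypothetical guard for a script that needs a minimum version
require_min_version <- function(pkg, min_version) {
  installed <- packageVersion(pkg)
  if (installed < min_version) {
    stop(
      pkg, " ", min_version, " or newer is required; found ",
      format(installed)
    )
  }
  invisible(TRUE)
}

require_min_version("stats", "3.5.0")
```

A guard like this at the top of a script turns a version mismatch into an explicit error rather than a subtle difference in results.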
renv: Reproducible Environments
Ideal for projects that must be reproducible on other machines.
# Initialize renv in a project
renv::init()
# Install packages (saved locally in project)
renv::install("dplyr")
# Freeze current state
renv::snapshot()
# Restore environment on another machine
renv::restore()
renv::snapshot() records the exact versions of all packages in a renv.lock file.
Using setwd() and relative paths ("../data/data.csv") is fragile and non-reproducible.
here::here()
The here package automatically detects the project root (for example, the folder containing the .Rproj file) and builds paths from there.
# Install
install.packages("here")
# Use
library(here)
data_path <- here("data", "customers.csv")
data <- read.csv(data_path)
✅ Works the same on Windows, Mac, or Linux.
✅ No need to change working directory.
✅ Ideal for sharing projects.
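The idea of "detect the project root, then build paths from it" can be sketched in base R. This is not how here is implemented; find_project_root is a hypothetical helper that only looks for an .Rproj file, and the demo project lives in a temporary folder.

```r
# Hypothetical helper: walk up from `start` until a folder
# containing an .Rproj file is found.
find_project_root <- function(start = getwd()) {
  current <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (length(list.files(current, pattern = "\\.Rproj$")) > 0) {
      return(current)
    }
    parent <- dirname(current)
    if (identical(parent, current)) {
      stop("No .Rproj file found above ", start)
    }
    current <- parent
  }
}

# Demo project in a temporary folder
project_dir <- file.path(tempdir(), "demo_project")
dir.create(
  file.path(project_dir, "scripts"),
  recursive = TRUE, showWarnings = FALSE
)
file.create(file.path(project_dir, "demo_project.Rproj"))

# Called from a subfolder, the root is still found
root <- find_project_root(file.path(project_dir, "scripts"))
data_path <- file.path(root, "data", "customers.csv")
```

Because the path is built from the detected root, the same call gives a valid path whether the script runs from the project folder or from scripts/.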
fs
For advanced file and folder manipulation.
library(fs)
library(here) # needed for here() below
# Create directory
dir_create("output")
# List files
dir_ls("data/")
# Check if exists
file_exists(here("data", "customers.csv"))
Some base R functions are slow, inconsistent, or behave in unexpected ways. The modern ecosystem (tidyverse) offers alternatives that address many of these problems.
| Base Function | Modern Alternative | Advantages |
|---|---|---|
| read.csv() | readr::read_csv() | Faster, doesn’t convert strings to factors, explicit types |
| data.frame() | tibble::tibble() | Doesn’t print 1000 rows by default, preserves types, clearer |
| factor() | forcats::as_factor() | Better level handling, integrated with tidyverse |
| strsplit() | stringr::str_split() | Consistent, always returns list or vector, more readable |
| Sys.time() | lubridate::now() | More readable, easy date manipulation |
Example:
# ❌ Base R
data <- read.csv("data.csv", stringsAsFactors = FALSE)
data$date <- as.Date(data$date, "%Y-%m-%d")
# ✅ Modern
library(readr)
library(dplyr)
library(lubridate)
library(here)
data <- read_csv(here("data", "data.csv")) %>%
  mutate(date = ymd(date))
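The stringsAsFactors argument in the base R example is there because of the strings-to-factors behavior mentioned in the table. A small base R sketch, assuming R 4.0.0 or later (where the default changed to FALSE) and using made-up inline data:

```r
# Made-up data, read from a string instead of a file
csv_text <- "name,city\nAna,Lima\nLuis,Quito"

# Default in R >= 4.0.0: strings stay as character
with_strings <- read.csv(text = csv_text)

# Old default (R < 4.0.0), reproduced explicitly: strings become factors
with_factors <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(with_strings$city)
class(with_factors$city)
```

Code that must also run on older R versions still benefits from setting the argument explicitly, which is what the base R example does.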
The tidyverse is a collection of packages designed to work together, with a consistent philosophy and grammar, created primarily by Hadley Wickham and the team at Posit (formerly RStudio).
| Package | Purpose |
|---|---|
| ggplot2 | Data visualization |
| dplyr | Data frame manipulation |
| tidyr | Data cleaning and reshaping |
| readr | Reading flat files |
| purrr | Functional programming |
| tibble | Modern data frames |
| stringr | String manipulation |
| forcats | Factor manipulation |
| lubridate | Date manipulation |
# Install entire tidyverse
install.packages("tidyverse")
# Load (loads core packages)
library(tidyverse)
📌 Note: Loading tidyverse attaches only its core packages (ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr, forcats and, since tidyverse 2.0.0, lubridate). Other packages installed along with it, such as readxl or haven, must be loaded explicitly.
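Whether a package is attached can be checked directly against the search path. is_attached below is a hypothetical helper built on base R's search(); the comments describe a default R session.

```r
# Hypothetical helper: is a package attached to the search path?
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

is_attached("base")   # base is always attached
is_attached("stats")  # attached in a default R session
is_attached("tools")  # ships with R, but is normally not attached
```

After library(tidyverse), the same helper can confirm which of its packages actually ended up on the search path.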
# Error: dplyr is not attached, so this calls stats::filter() and fails
filter(mtcars, cyl == 4)
# Solution
library(dplyr)
filter(mtcars, cyl == 4)
library(dplyr)
# stats is attached when R starts, so library(stats) would change nothing
# Which filter is used?
filter(mtcars, cyl == 4) # Uses dplyr::filter (attached last, masks stats)
# Solution: be explicit
dplyr::filter(mtcars, cyl == 4)
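To see which package a bare call resolves to, base R offers find() (from utils) and environment(). The sketch below runs in any session; the results depend on which packages are attached, and the variable names are illustrative.

```r
# find() lists every attached package that defines the name,
# in search-path order; the first entry wins for a bare call.
filter_sources <- find("filter")
print(filter_sources)

# environment() shows where the function R resolves actually lives
winning_package <- environmentName(environment(filter))
print(winning_package)
```

In a session without dplyr only stats appears; once dplyr is attached it is listed first, which is the masking described above.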
# ❌ Fragile
setwd("C:/Users/Juan/Project/data")
data <- read.csv("customers.csv")
# ✅ Robust
library(here)
library(readr)
data <- read_csv(here("data", "customers.csv"))
Old versions may have bugs or incompatibilities. Use:
update.packages(ask = FALSE)
Or better, use renv to freeze versions in critical projects.
Before delivering or sharing your code, verify:
✅ You use snake_case for variable and function names.
✅ Your lines do not exceed 80 characters.
✅ You use here::here() for paths.
✅ You load packages with library() at the top of the script.
✅ You use modern functions (read_csv, tibble, etc.).
✅ Your code is formatted with styler.
✅ You’ve checked for errors with lintr.
✅ You’ve documented complex parts with comments.
✅ Your project has a clear folder structure: /data, /scripts, /output, /docs.
✅ You use renv if the project must be reproducible in another environment.
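The 80-character item in the checklist is easy to verify by hand. lintr has its own line length linter; the base R sketch below is only an illustration of the check, with find_long_lines as a hypothetical helper and a temporary file standing in for a real script.

```r
# Hypothetical helper: line numbers of lines longer than `limit`
find_long_lines <- function(path, limit = 80) {
  lines <- readLines(path, warn = FALSE)
  which(nchar(lines, type = "chars") > limit)
}

# Demo on a temporary script whose second line is too long
script_path <- tempfile(fileext = ".R")
writeLines(
  c(
    "x <- 1",
    paste0("y <- c(", paste(rep("1000", 30), collapse = ", "), ")"),
    "z <- x + 1"
  ),
  script_path
)

long_lines <- find_long_lines(script_path)
print(long_lines)
```

An empty result means the script passes this item of the checklist.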
- Create the project folders: data/, scripts/, output/, docs/.
- Save a dataset in data/ (e.g., mtcars as data/cars.csv).
- Create scripts/cleaning.R that:
  - Loads the packages (tidyverse, here).
  - Reads the data using here().
  - Saves the filtered data to output/filtered_cars.csv.
- Use styler to format the script.
- Run lintr::lint() and fix warnings.
- Initialize renv and freeze the environment.

- tidyverse Style Guide: https://style.tidyverse.org/
- here Documentation: https://here.r-lib.org/
- renv Documentation: https://rstudio.github.io/renv/
- styler and lintr Cheatsheet: https://www.rstudio.com/resources/cheatsheets/

✅ With this unit, you’ve laid the foundation for a professional, reproducible, and scalable workflow in R. You’re now ready to dive into the powerful world of the tidyverse in the next module.