📘 Unit 1.4: Best Practices and Package Management

🎯 Learning Objectives

By the end of this unit, the student will be able to:

Apply style and naming conventions to write readable and professional R code.
Install, load, and manage packages from CRAN, Bioconductor, and GitHub.
Use modern tools to enhance productivity and reproducibility (here, fs, styler, lintr).
Understand the importance of the tidyverse as a coherent ecosystem for data analysis.
Avoid common errors when managing paths, dependencies, and work environments.

📚 1. Code Style and Readability

1.1. Why Style Matters

Code is not just for machines — it’s for humans too. Well-written code is:

Easy to read and understand (by you and others).
Easy to maintain and modify.
Less prone to errors.
Professional and ready for collaboration or production.

1.2. Naming Conventions

R does not enforce a style, but widely adopted conventions exist:

Type	Recommended Style	Example	Usage
Objects, variables, functions	`snake_case`	`customer_data`, `calculate_mean`	Standard in `tidyverse`
S3/S4 Classes, constructors	`PascalCase`	`DataFrame`, `LinearModel`	Base and advanced packages
Constants	`ALL_CAPS`	`PI <- 3.1416`	Optional, rarely used in R
Temporary variables	`.` or short names	`.x`, `tmp`	Only in functions or pipes

✅ Recommendation: Use snake_case for everything, unless you are developing a package with classes.
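
As a small illustration of the conventions in the table, the sketch below uses snake_case for a variable and a function, and PascalCase for an S3 class and its constructor. All names (`customer_ages`, `calculate_mean`, `CustomerSummary`) are hypothetical and chosen only for this example.

```r
# snake_case for ordinary objects and functions
customer_ages <- c(34, 27, 45, NA)

calculate_mean <- function(values) {
  # na.rm = TRUE drops missing values before averaging
  mean(values, na.rm = TRUE)
}

# PascalCase for the class and the constructor that creates it
CustomerSummary <- function(ages) {
  structure(
    list(n = sum(!is.na(ages)), mean_age = calculate_mean(ages)),
    class = "CustomerSummary"
  )
}

summary_obj <- CustomerSummary(customer_ages)
class(summary_obj)
summary_obj$mean_age
```

The different casing makes it visible at a glance which calls create a classed object and which are ordinary helpers.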

1.3. Code Formatting and Structure

Indentation and spacing

Use 2 spaces (not tabs) for indentation.
Put spaces around operators: x <- 5 + 3, not x<-5+3.
After commas: c(1, 2, 3), not c(1,2,3).

Line length

Maximum 80 characters per line (ideal for terminal and diff readability).
If a line is too long, break it after a pipe (%>%), after a comma inside a function call, or after + in ggplot2.

library(dplyr)

# ❌ All conditions chained in a single expression
result <- filter(mtcars, cyl == 4 & mpg > 25 & wt < 2.5)

# ✅ Better with pipe: one step per line, one condition per argument
result <- mtcars %>%
  filter(cyl == 4, mpg > 25, wt < 2.5)
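
The same line-breaking idea applies to ggplot2, where a long plot specification is split after each `+`. A minimal sketch, assuming ggplot2 is installed; the object name `p` and the axis labels are illustrative.

```r
library(ggplot2)

# One layer per line; each line ends with + so R knows the
# expression continues on the next line
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

p
```

Placing `+` at the end of a line rather than at the start of the next one matters: a line that starts with `+` is parsed as a separate expression.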

1.4. Automated Formatting Tools

`styler`

Automatically formats your code according to tidyverse style guides.

# Install
install.packages("styler")

# Use on a script
styler::style_file("my_script.R")

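# Use in RStudio: Addins menu > "Style active file" (installed with styler)
# Note: Ctrl + Shift + A (Cmd + Shift + A on Mac) runs RStudio's built-in
# "Reformat Code" command, not styler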

`lintr`

Performs static analysis of your code: it flags style violations, likely mistakes, and departures from best practices without running the script.

# Install
install.packages("lintr")

# Check a file
lintr::lint("my_script.R")

# RStudio integration: shown in the "Markers" panel
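
Both tools can also be tried on a code string, without creating a file, which is a quick way to see what they do. A minimal sketch, assuming both packages are installed (and lintr 3.0 or later for the `text` argument); `messy_code` is a hypothetical name.

```r
messy_code <- "x<-5+3"

# style_text() returns the reformatted code instead of editing a file
styled <- styler::style_text(messy_code)
styled

# lint() accepts code through its text argument and returns a list
# of lint objects, one per problem found
lints <- lintr::lint(text = messy_code)
lints
```

styler rewrites the code, while lintr only reports problems, so the two are typically used together: format first, then review whatever lintr still flags.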

📦 2. Package Management

2.1. What are R Packages?

Packages are collections of functions, data, documentation, and compiled code that extend R’s capabilities. A standard R installation ships with about 30 packages (the base and recommended sets); CRAN hosts more than 19,000.

2.2. Installing Packages

From CRAN (main repository)

install.packages("dplyr")
install.packages(c("ggplot2", "readr", "lubridate")) # multiple

From Bioconductor (biology, genomics)

# requireNamespace() checks for the package without attaching it
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

BiocManager::install("DESeq2")

From GitHub (development version)

# Install remotes if you don't have it
install.packages("remotes")

# Install from GitHub
remotes::install_github("tidyverse/ggplot2")
remotes::install_github("rstudio/leaflet")
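
Because `install.packages()` reinstalls a package even when it is already present, scripts often check what is missing first. A minimal sketch using only base R; `find_missing_packages` is a hypothetical helper name, and the install call is left commented out so that running the example does not modify your library.

```r
# Return the subset of `pkgs` that is not installed
find_missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

missing_pkgs <- find_missing_packages(c("stats", "dplyr", "ggplot2"))
missing_pkgs

# Install only what is missing
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```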

2.3. Loading Packages

library(dplyr)     # attaches the package; stops with an error if it is missing
require(ggplot2)   # attaches too, but returns FALSE (with a warning) if missing

# Load without attaching (avoids naming conflicts)
dplyr::filter(mtcars, cyl == 4)

⚠️ Caution: Some packages have functions with the same name (e.g., filter in dplyr and stats). Use :: to specify.
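
When it is unclear which package a bare function name resolves to, base R can report it. A small sketch; what the first two calls return depends on which packages are attached in your session.

```r
# List every attached package that provides an object called filter,
# in the order R searches them
find("filter")

# Report the namespace of the function that a bare filter() call uses
environmentName(environment(filter))

# With ::, the result no longer depends on the search path
environmentName(environment(stats::filter))
```

In a fresh session without dplyr attached, the first two calls point to stats; after `library(dplyr)` they point to dplyr, while the `stats::filter` line is unaffected.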

2.4. Updating and Removing Packages

# Update all packages
update.packages(ask = FALSE)

# Update a specific package (reinstalling fetches the latest version)
install.packages("dplyr", dependencies = TRUE)

# View outdated packages
old.packages()

# Remove a package
remove.packages("package_name")
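
Before updating, it can help to check which version is currently installed. A minimal sketch with base R's `packageVersion()`; the package `stats` and the threshold "3.0.0" are used only because they work in every installation.

```r
# packageVersion() returns a version object, not a plain string
current <- packageVersion("stats")
current

# Version objects can be compared directly with a version string
current >= "3.0.0"

# Comparison is component-wise, so 4.10.0 is newer than 4.9.0
package_version("4.10.0") > package_version("4.9.0")
```

Comparing version objects rather than character strings avoids the trap where "4.10.0" sorts before "4.9.0" alphabetically.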

2.5. Dependency and Environment Management

`renv` — Reproducible Environments

Ideal for projects that must be reproducible on other machines.

# Initialize renv in a project
renv::init()

# Install packages (saved locally in project)
renv::install("dplyr")

# Freeze current state
renv::snapshot()

# Restore environment on another machine
renv::restore()

renv::snapshot() writes a renv.lock file that records the exact version of every package the project uses; renv::restore() reinstalls those versions.

🧭 3. File Paths and File Handling

3.1. The Problem with Relative Paths

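Calling setwd() with an absolute path ties a script to one machine, and relative paths such as "../data/data.csv" are resolved against whatever the working directory happens to be when the code runs. Both make a script fragile and hard to reproduce.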
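
The fragility is easy to reproduce: the same relative path can point to a file from one working directory and to nothing from another. A self-contained base R sketch that uses a temporary folder as a stand-in for a project (the names `project_dir`, `data/customers.csv`, and `scripts` are illustrative):

```r
# Build a throwaway project: <tmp>/data/customers.csv and <tmp>/scripts/
project_dir <- file.path(tempdir(), "demo_project")
dir.create(
  file.path(project_dir, "data"),
  recursive = TRUE, showWarnings = FALSE
)
dir.create(file.path(project_dir, "scripts"), showWarnings = FALSE)
write.csv(
  data.frame(id = 1:3),
  file.path(project_dir, "data", "customers.csv"),
  row.names = FALSE
)

old_wd <- setwd(project_dir)
found_from_root <- file.exists("data/customers.csv")

# Same relative path, different working directory
setwd("scripts")
found_from_scripts <- file.exists("data/customers.csv")

setwd(old_wd) # restore the original working directory

found_from_root    # TRUE
found_from_scripts # FALSE
```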

3.2. Solution: `here::here()`

The here package finds the project root by searching upward from the working directory for a marker such as an .Rproj file, a .here file, or a .git folder, and builds paths from there.

# Install
install.packages("here")

# Use
library(here)

data_path <- here("data", "customers.csv")
data <- read.csv(data_path)

✅ Works the same on Windows, Mac, or Linux.
✅ No need to change working directory.
✅ Ideal for sharing projects.

3.3. Modern Alternative: `fs`

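The fs package is a modern alternative to base R's file functions (dir.create(), list.files(), file.exists()), with consistent names and vectorized, cross-platform behavior.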

library(fs)
library(here) # needed for here() below

# Create directory
dir_create("output")

# List files
dir_ls("data/")

# Check if exists
file_exists(here("data", "customers.csv"))

🔄 4. Modern Alternatives to Base Functions

Some base R functions are slow on large inputs, inconsistent in naming and argument order, or surprising in their defaults. The tidyverse offers alternatives that are often easier to work with.

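Base Function	Modern Alternative	Advantages
`read.csv()`	`readr::read_csv()`	Faster on large files, returns a tibble, reports the guessed column types, never creates factors (also the `read.csv()` default since R 4.0.0)
`data.frame()`	`tibble::tibble()`	Prints only the first 10 rows along with column types, never changes column names or converts strings to factors
`factor()`	`forcats::as_factor()`	Keeps levels in order of first appearance instead of sorting them alphabetically, integrated with tidyverse
`strsplit()`	`stringr::str_split()`	Consistent `str_` naming and argument order, returns a list (or a character matrix with `simplify = TRUE`)
`Sys.time()`	`lubridate::now()`	Optional time zone argument, part of a consistent toolkit for date manipulation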

Example:

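# ❌ Base R (stringsAsFactors = FALSE is already the default since R 4.0.0)
data <- read.csv("data.csv", stringsAsFactors = FALSE)
data$date <- as.Date(data$date, "%Y-%m-%d")

# ✅ Modern
library(readr)
library(dplyr)     # provides %>% and mutate()
library(lubridate)
library(here)

data <- read_csv(here("data", "data.csv")) %>%
  mutate(date = ymd(date))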
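
One of the differences listed in the table above is easy to see in a short example: `factor()` sorts levels alphabetically, while `forcats::as_factor()` keeps them in order of first appearance. A minimal sketch, assuming forcats is installed; the vector `sizes` is made up for illustration.

```r
sizes <- c("small", "large", "medium", "small")

# Base R: levels are sorted alphabetically
levels(factor(sizes))
# "large" "medium" "small"

# forcats: levels follow the order in which values first appear
levels(forcats::as_factor(sizes))
# "small" "large" "medium"
```

Level order affects how categories are arranged in plots and model output, so it is worth knowing which behavior you are getting.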

🌐 5. Introduction to the Tidyverse

5.1. What is the Tidyverse?

A collection of packages designed to work together, sharing a consistent philosophy and grammar, created primarily by Hadley Wickham and the team at Posit (formerly RStudio).

5.2. Core Packages

Package	Purpose
`ggplot2`	Data visualization
`dplyr`	Data frame manipulation
`tidyr`	Data cleaning and reshaping
`readr`	Reading flat files
`purrr`	Functional programming
`tibble`	Modern data frames
`stringr`	String manipulation
`forcats`	Factor manipulation
`lubridate`	Date manipulation

5.3. Installation and Loading

# Install entire tidyverse
install.packages("tidyverse")

# Load (loads core packages)
library(tidyverse)

📌 Note: library(tidyverse) attaches only the core packages listed above, which include stringr, forcats, and (since tidyverse 2.0.0) lubridate. Other tidyverse members that are installed alongside them, such as readxl, haven, or rvest, must be loaded explicitly with library().
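
To see which tidyverse members are installed but not attached, the tidyverse package can list its own members. A sketch, assuming the tidyverse is installed; `all_members` and `attached` are illustrative names.

```r
library(tidyverse)

# Every package that belongs to the tidyverse, attached or not
all_members <- tidyverse_packages()

# Packages currently on the search path
attached <- sub("^package:", "", grep("^package:", search(), value = TRUE))

# Tidyverse members that still need their own library() call
setdiff(all_members, attached)
```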

🛑 6. Common Errors and How to Avoid Them

6.1. Forgetting to load a package

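# Without dplyr attached, filter() resolves to stats::filter(), so the
# call fails with "object 'cyl' not found" rather than "could not find
# function"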
filter(mtcars, cyl == 4)

# Solution
library(dplyr)
filter(mtcars, cyl == 4)

6.2. Name conflicts

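library(dplyr)
# stats is already attached when R starts, so dplyr is attached later
# and its filter() masks stats::filter()

# Which filter is used?
filter(mtcars, cyl == 4) # Uses dplyr::filter (attached most recently)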

# Solution: be explicit
dplyr::filter(mtcars, cyl == 4)

6.3. Broken paths when sharing projects

# ❌ Fragile
setwd("C:/Users/Juan/Project/data")
data <- read.csv("customers.csv")

# ✅ Robust
library(here)
library(readr)
data <- read_csv(here("data", "customers.csv"))

6.4. Not updating packages

Old versions may have bugs or incompatibilities. Use:

update.packages(ask = FALSE)

Or better, use renv to freeze versions in critical projects.

📝 7. Best Practices Checklist

Before delivering or sharing your code, verify:

✅ You use snake_case for variable and function names.
✅ Your lines do not exceed 80 characters.
✅ You use here::here() for paths.
✅ You load packages with library() at the top of the script.
✅ You use modern functions (read_csv, tibble, etc.).
✅ Your code is formatted with styler.
✅ You’ve checked for errors with lintr.
✅ You’ve documented complex parts with comments.
✅ Your project has a clear folder structure: data/, scripts/, output/, docs/.
✅ You use renv if the project must be reproducible in another environment.

🧪 Practical Exercise: “Code Organizer” Project

Create a new project in RStudio.
Structure folders: data/, scripts/, output/, docs/.
Add a CSV dataset to data/ (e.g., export the built-in mtcars data as data/cars.csv).
Create a script in scripts/cleaning.R that:
- Loads necessary packages (tidyverse, here).
- Reads the file using here().
- Performs a simple transformation (e.g., filter cars with more than 100 hp).
- Saves the result to output/filtered_cars.csv.
Use styler to format the script.
Run lintr::lint() and fix warnings.
(Optional) Initialize renv and freeze the environment.
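
The folder structure and the sample dataset can also be created from the console. A minimal base R sketch; it builds everything inside a temporary folder (`project_dir` is a placeholder) so that it is safe to run as is, and in a real project you would run the same calls from the project root.

```r
project_dir <- file.path(tempdir(), "code_organizer")

# Create data/, scripts/, output/, and docs/ in one pass
folders <- file.path(project_dir, c("data", "scripts", "output", "docs"))
for (folder in folders) {
  dir.create(folder, recursive = TRUE, showWarnings = FALSE)
}

# Export the built-in mtcars data; the car names are stored as row
# names, so row.names = TRUE keeps them as the first column
cars_path <- file.path(project_dir, "data", "cars.csv")
write.csv(mtcars, cars_path, row.names = TRUE)

list.files(project_dir, recursive = TRUE, include.dirs = TRUE)
```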

📚 Additional Resources

tidyverse Style Guide: https://style.tidyverse.org/
here Documentation: https://here.r-lib.org/
renv Documentation: https://rstudio.github.io/renv/
CRAN Package List: https://cran.r-project.org/web/packages/available_packages_by_name.html
styler and lintr Cheatsheet: https://www.rstudio.com/resources/cheatsheets/

✅ With this unit, you’ve laid the foundation for a professional, reproducible, and scalable workflow in R. You’re now ready to dive into the powerful world of the tidyverse in the next module.

Course Info

Course: R-zero-to-hero

Language: EN

Lesson: Module04