Wrap-up: Where to go from here

Lecture 26

Dr. Benjamin Soltoff

Cornell University
INFO 5001 - Fall 2025

December 4, 2025

End-of-semester logistics

Remaining assignments

  • Project presentations tomorrow
  • Remaining project components on Monday
  • Extra credit
  • Final exam

Build a simple data science stack

A digital cartoon with two illustrations: the top shows the R-logo with a scary face, and a small scared little fuzzy monster holding up a white flag in surrender while under a dark storm cloud. The text above says "at first I was like..." The lower cartoon is a friendly, smiling R-logo jumping up to give a happy fuzzy monster a high-five under a smiling sun and next to colorful flowers. The text above the bottom illustration reads "but now it’s like..."

Posit Workbench

  • Access to Posit Workbench will end at some point after December 26th
  • All INFO 5001 materials remain available in your repos on GitHub as long as you are an active student
  • Any other work you have done on the server will not be accessible after the end of the semester
  • Where will you go from here?

Software installation

Programming language

Reproducibility

To {renv} or not to {renv}

Benefits

  • Isolated
  • Portable
  • Reproducible

Drawbacks

  • It’s a pain to configure for every project
  • Some packages have issues installing via {renv}
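
The trade-off above is easier to judge once you see how few calls a typical {renv} project involves. The following is a minimal sketch, not a full setup guide. `renv_workflow()` is a hypothetical helper (it is not part of {renv}) that wraps four real {renv} functions so that nothing runs until you call it from inside a project directory.

```r
# Hypothetical helper: run one step of the usual {renv} lifecycle.
# Defining the function does nothing; call it from inside a project.
renv_workflow <- function(step = c("init", "snapshot", "restore", "status")) {
  step <- match.arg(step)
  if (!requireNamespace("renv", quietly = TRUE)) {
    stop("Install {renv} first with install.packages(\"renv\")", call. = FALSE)
  }
  switch(step,
    init = renv::init(),         # create a project library and renv.lock
    snapshot = renv::snapshot(), # record current package versions in renv.lock
    restore = renv::restore(),   # reinstall the versions recorded in renv.lock
    status = renv::status()      # report differences between library and lockfile
  )
}
```

In practice you would run `renv_workflow("init")` once per project, `renv_workflow("snapshot")` after adding or updating packages, and `renv_workflow("restore")` on a new machine. Calling the four {renv} functions directly works just as well.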

Install some core R packages

# install the major packages from the course published on CRAN
install.packages(c(
  "tidyverse",
  "tidymodels",
  "devtools",
  "usethis",
  "colorspace",
  "janitor",
  "skimr"
))

# install a package hosted on GitHub
remotes::install_github(repo = "cis-ds/rcis")
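
If you already have some of these packages on your own machine, you may prefer to check what is missing before reinstalling everything. The sketch below assumes the same CRAN package list as above. `missing_pkgs()` is a hypothetical helper written for this example, and the install call is left commented out so the code only reports what it finds.

```r
# CRAN packages used in the course (same list as above)
course_pkgs <- c(
  "tidyverse", "tidymodels", "devtools", "usethis",
  "colorspace", "janitor", "skimr"
)

# Hypothetical helper: return the packages that are not yet installed
missing_pkgs <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_pkgs(course_pkgs)

if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install them
} else {
  message("All course packages are installed.")
}
```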

Create a GitHub account

Configure Git

usethis::use_git_config(
  user.name = "Your name",
  user.email = "Email associated with your GitHub account"
)

Painless authentication with PAT

Personal Access Token

Set up PAT authentication

Create PAT

usethis::create_github_token(
  scopes = c("repo", "user", "gist", "workflow"),
  description = "<DESCRIBE YOUR DEVICE>"
)

Store PAT

gitcreds::gitcreds_set()

#> ? Enter password or token: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#> -> Adding new credentials...
#> -> Removing credentials from cache...
#> -> Done.

Create PAT for github.coecis.cornell.edu

usethis::create_github_token(
  scopes = c("repo", "user", "gist", "workflow"),
  description = "<DESCRIBE YOUR DEVICE>",
  host = "https://github.coecis.cornell.edu/"
)

Store PAT for github.coecis.cornell.edu

gitcreds::gitcreds_set(url = "https://github.coecis.cornell.edu/")

#> ? Enter password or token: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#> -> Adding new credentials...
#> -> Removing credentials from cache...
#> -> Done.
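
Check your setup

Once your Git identity and a token are stored, it can help to confirm that R can see them. One way to do this, sketched below, is `usethis::git_sitrep()`, which prints a report of your Git configuration and GitHub credentials. The `has_usethis` variable is only an illustration so the code does not fail when {usethis} is missing. The exact report depends on your machine, so no output is shown.

```r
# Check whether {usethis} is available before calling it
has_usethis <- requireNamespace("usethis", quietly = TRUE)

if (has_usethis) {
  # Print a situation report: Git user name/email and GitHub PAT status
  usethis::git_sitrep()
} else {
  message("Install {usethis} to run git_sitrep().")
}
```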

What have you learned?

Learning objectives for INFO 5001

  • Construct and execute basic programs in R using elementary programming techniques (e.g. loops, conditional statements, user-defined functions) and {tidyverse} packages.
  • Implement data science workflows using common, reproducible methods and software tools.
  • Responsibly utilize large language models (LLMs) and generative AI tools to assist with coding and data analysis.
  • Apply stylistic principles of coding to generate reusable, interpretable code.
  • Utilize reference documentation and debugging tools to troubleshoot problems.
  • Apply Git and GitHub workflows for version control.

Where to go from here

Courses

  • INFO 5101: Learning Analytics
  • INFO 5312: Data Communication
  • STSCI 5080: Probability Models and Inference
  • STSCI 5090: Theory of Statistics
  • STSCI 5740: Data Mining and Machine Learning
  • Other courses in CS, Statistics, and Data Science

Find a community

Two fuzzy monsters standing side-by-side outside of a door frame through which is a magical wonderland of different R communities, with a "mind blown" rainbow coming out of the one closest to the door. A welcome mat says "Welcome."

Online communities

Keep your skills fresh

Presentations tomorrow!