Wrap-up: Where to go from here

Lecture 25

Dr. Benjamin Soltoff

Cornell University
INFO 5001 - Fall 2023

2023-11-29

End-of-semester logistics

Remaining assignments

  • Homework 06
  • Extra credit
  • Group project

Build a simple data science stack

A digital cartoon with two illustrations: the top shows the R-logo with a scary face, and a small scared little fuzzy monster holding up a white flag in surrender while under a dark storm cloud. The text above says "at first I was like..." The lower cartoon is a friendly, smiling R-logo jumping up to give a happy fuzzy monster a high-five under a smiling sun and next to colorful flowers. The text above the bottom illustration reads "but now it’s like..."

RStudio Workbench

  • Access to RStudio Workbench will end at some point after December 17th
  • All INFO 5001 materials remain available in your repos on GitHub as long as you are an active student
  • Any other work you have done on the server will not be accessible after the end of the semester
  • Where will you go from here?

Software installation

Install some core R packages

# install the major course packages that are published on CRAN
install.packages(c(
  "tidyverse", "tidymodels", "devtools", "usethis",
  "colorspace", "janitor", "skimr", "tidytext"
))

# install a package hosted on GitHub
remotes::install_github(repo = "cis-ds/rcis")
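
If some of these packages are already on your machine, you may prefer to install only the missing ones. The sketch below is one way to do that in base R. The helper name `find_missing_packages()` is hypothetical (it is not part of any package), and the `interactive()` guard is an assumption added so the snippet never installs anything when run in a non-interactive script.

```r
# hypothetical helper: return the packages in `pkgs` that are not installed
find_missing_packages <- function(pkgs,
                                  installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

course_pkgs <- c(
  "tidyverse", "tidymodels", "devtools", "usethis",
  "colorspace", "janitor", "skimr", "tidytext"
)

missing_pkgs <- find_missing_packages(course_pkgs)

# install only what is missing, and only in an interactive session
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Passing `installed` as an argument keeps the helper easy to check against a known list of package names.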

Create a GitHub account

Configure Git

usethis::use_git_config(
  user.name = "Your name", 
  user.email = "Email associated with your GitHub account"
)
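
To confirm that the name and email were recorded, one option is the situation report from usethis. This is a quick check rather than a required step, and its output will differ from machine to machine.

```r
# print a report of the Git and GitHub settings that usethis can detect,
# including the configured user name and email
usethis::git_sitrep()
```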

Painless authentication with PAT

Personal Access Token

  • Uses HTTPS protocol
  • More flexible than SSH
  • Integrates with the usethis package to automate some Git workflows
  • Alternative: continue using SSH

Set up PAT authentication

Create PAT (github.com)

usethis::create_github_token(
  scopes = c("repo", "user", "gist", "workflow"),
  description = "<DESCRIBE YOUR DEVICE>"
)

Store PAT (github.com)

gitcreds::gitcreds_set()

#> ? Enter password or token: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#> -> Adding new credentials...
#> -> Removing credentials from cache...
#> -> Done.
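
After storing the token, you may want to check that it can be retrieved. The sketch below assumes the gitcreds and gh packages are installed (both normally arrive as dependencies of usethis). The helper `has_stored_credential()` is hypothetical and simply reports whether `gitcreds::gitcreds_get()` succeeds for a given host.

```r
# hypothetical helper: TRUE if a credential can be retrieved for `url`,
# FALSE if gitcreds signals an error (e.g. nothing stored, or Git not found)
has_stored_credential <- function(url = "https://github.com") {
  tryCatch(
    {
      gitcreds::gitcreds_get(url = url)
      TRUE
    },
    error = function(e) FALSE
  )
}

has_stored_credential()

# ask the GitHub API which account the stored token belongs to
gh::gh_whoami()
```

To check a different host, pass its address as `url`.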

Create PAT (github.coecis.cornell.edu)

usethis::create_github_token(
  scopes = c("repo", "user", "gist", "workflow"),
  description = "<DESCRIBE YOUR DEVICE>",
  host = "https://github.coecis.cornell.edu/"
)

Store PAT (github.coecis.cornell.edu)

gitcreds::gitcreds_set(url = "https://github.coecis.cornell.edu/")

#> ? Enter password or token: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#> -> Adding new credentials...
#> -> Removing credentials from cache...
#> -> Done.

What have you learned?

Learning objectives for INFO 5001

  • Construct and execute basic programs in R using elementary programming techniques and tidyverse packages (e.g. loops, conditional statements, user-defined functions).
  • Implement data science workflows using common, reproducible methods and software tools.
  • Apply stylistic principles of coding to generate reusable, interpretable code.
  • Utilize reference documentation and debugging tools to troubleshoot problems.
  • Identify and use external libraries to expand on base functions.
  • Apply Git and GitHub workflows for version control.
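
As a compact reminder of the first objective, the sketch below combines a user-defined function, a conditional statement, and a loop in base R. The function name `classify_score()` and its cutoffs are made up for illustration and do not come from the course materials.

```r
# user-defined function with a conditional statement
# (hypothetical example; the name and thresholds are illustrative only)
classify_score <- function(score) {
  if (is.na(score)) {
    "missing"
  } else if (score >= 90) {
    "high"
  } else if (score >= 70) {
    "medium"
  } else {
    "low"
  }
}

scores <- c(95, 72, NA, 40)

# explicit loop: pre-allocate the output, then fill it element by element
labels_loop <- character(length(scores))
for (i in seq_along(scores)) {
  labels_loop[i] <- classify_score(scores[i])
}

# functional equivalent: apply the function to each element
labels_apply <- vapply(scores, classify_score, character(1))
```

Both approaches produce the same character vector. In tidyverse code, `purrr::map_chr()` plays the same role as `vapply()` here.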

Where to go from here

Courses

  • INFO 5101: Learning Analytics
  • INFO 5312: Data Communication
  • INFO 5371: Studying Social Inequality Using Data Science
  • STSCI 5080: Probability Models and Inference
  • STSCI 5090: Theory of Statistics
  • STSCI 5740: Data Mining and Machine Learning
  • Other courses in CS/Statistics/Data Science

Find a community

Two fuzzy monsters standing side-by-side outside of a door frame through which is a magical wonderland of different R communities, with a "mind blown" rainbow coming out of the one closest to the door. A welcome mat says "Welcome."

Online communities

Keep your skills fresh

Presentations on Monday!