Customizing Quarto reports and presentations

Lecture 18

Dr. Benjamin Soltoff

Cornell University
INFO 5001 - Fall 2023

2023-10-30

Announcements

Mike from 'Jersey Shore' stating at a wedding toast 'You survived that sh*t'.

Announcements

Project exploration due Thursday

`ae-16`

Go to the course GitHub org and find your ae-16 (repo name will be suffixed with your GitHub name).
Clone the repo in RStudio Workbench, open the Quarto document in the repo, and follow along and complete the exercises.
Render, commit, and push your edits by the AE deadline – end of tomorrow

Quarto

Quarto basics

---
title: Gun deaths
author: Your name
date: today
format: html
---

```{r}
#| label: setup
#| include: false
library(tidyverse)
library(rcis)

youth <- gun_deaths |>
  filter(age <= 65)
```

# Gun deaths by age

We have data about `r nrow(gun_deaths)` individuals killed by guns. Only `r nrow(gun_deaths) - nrow(youth)` are older than 65. The distribution of the remainder is shown below:

```{r}
#| label: youth-dist
#| echo: false
ggplot(data = youth, mapping = aes(x = age)) + 
  geom_freqpoly(binwidth = 1)
```

# Gun deaths by race

```{r}
#| label: race-dist
youth |>
  mutate(race = fct_infreq(race) |> fct_rev()) |>
  ggplot(mapping = aes(y = race)) +
  geom_bar() +
  labs(y = "Victim race")
```

Major components

A YAML header surrounded by ---s
Chunks of code surrounded by ```
Text mixed with simple text formatting using the Markdown syntax

Quarto code chunks

Rendering process

A schematic representing rendering of Quarto documents from .qmd, to knitr or jupyter, to plain text markdown, then converted by pandoc into any number of output types including html, PDF, or Word document.

Rendering process

A schematic representing the multi-language input (e.g. Python, R, Observable, Julia) and multi-format output (e.g. PDF, html, Word documents, and more) versatility of Quarto.

🤔 AE: Edit the Quarto document

Render gun-deaths.qmd as an HTML document
Add text describing the frequency polygon

Code chunks

```{r}
#| label: youth-dist
#| message: false
#| warning: false

# code goes here
```

Naming code chunks
Code chunk options
eval: true
include: true
echo: true
message: true or warning: true
cache: true

Caching with dependencies

```{r}
#| cache: true
scdb_case <- read_csv("data/scdb-case.csv") |>
  filter(term >= 1945)
```

```{r}
#| cache: true
scdb_clean <- scdb_case |> 
  mutate(one_vote = majVotes - minVotes == 1)
scdb_clean
```

# A tibble: 9,299 × 53
   caseId docketId caseIssuesId dateDecision decisionType usCite sctCite ledCite
   <chr>  <chr>    <chr>        <chr>               <dbl> <chr>  <chr>   <chr>  
 1 1945-… 1945-00… 1945-001-01… 12/10/1945              1 326 U… 66 S. … 90 L. …
 2 1945-… 1945-00… 1945-002-01… 12/3/1945               1 326 U… 66 S. … 90 L. …
 3 1945-… 1945-00… 1945-003-01… 11/13/1945              1 326 U… 66 S. … 90 L. …
 4 1945-… 1945-00… 1945-004-01… 11/13/1945              1 326 U… 66 S. … 90 L. …
 5 1945-… 1945-00… 1945-005-01… 11/5/1945               1 326 U… 66 S. … 90 L. …
 6 1945-… 1945-00… 1945-006-01… 11/5/1945               1 326 U… 66 S. … 90 L. …
 7 1945-… 1945-00… 1945-007-01… 11/5/1945               2 326 U… 66 S. … 90 L. …
 8 1945-… 1945-00… 1945-008-01… 11/5/1945               1 326 U… 66 S. … 90 L. …
 9 1945-… 1945-00… 1945-009-01… 11/5/1945               1 326 U… 66 S. … 90 L. …
10 1945-… 1945-01… 1945-010-01… 12/10/1945              1 326 U… 66 S. … 90 L. …
# ℹ 9,289 more rows
# ℹ 45 more variables: lexisCite <chr>, term <dbl>, naturalCourt <dbl>,
#   chief <chr>, docket <chr>, caseName <chr>, dateArgument <chr>,
#   dateRearg <chr>, petitioner <dbl>, petitionerState <dbl>, respondent <dbl>,
#   respondentState <dbl>, jurisdiction <dbl>, adminAction <dbl>,
#   adminActionState <dbl>, threeJudgeFdc <dbl>, caseOrigin <dbl>,
#   caseOriginState <dbl>, caseSource <dbl>, caseSourceState <dbl>, …

```{r}
#| cache: true
scdb_case <- read_csv("data/scdb-case.csv")
```

```{r}
#| cache: true
scdb_clean <- scdb_case |> 
  mutate(one_vote = majVotes - minVotes == 1)
scdb_clean
```

# A tibble: 9,299 × 53
   caseId docketId caseIssuesId dateDecision decisionType usCite sctCite ledCite
   <chr>  <chr>    <chr>        <chr>               <dbl> <chr>  <chr>   <chr>  
 1 1945-… 1945-00… 1945-001-01… 12/10/1945              1 326 U… 66 S. … 90 L. …
 2 1945-… 1945-00… 1945-002-01… 12/3/1945               1 326 U… 66 S. … 90 L. …
 3 1945-… 1945-00… 1945-003-01… 11/13/1945              1 326 U… 66 S. … 90 L. …
 4 1945-… 1945-00… 1945-004-01… 11/13/1945              1 326 U… 66 S. … 90 L. …
 5 1945-… 1945-00… 1945-005-01… 11/5/1945               1 326 U… 66 S. … 90 L. …
 6 1945-… 1945-00… 1945-006-01… 11/5/1945               1 326 U… 66 S. … 90 L. …
 7 1945-… 1945-00… 1945-007-01… 11/5/1945               2 326 U… 66 S. … 90 L. …
 8 1945-… 1945-00… 1945-008-01… 11/5/1945               1 326 U… 66 S. … 90 L. …
 9 1945-… 1945-00… 1945-009-01… 11/5/1945               1 326 U… 66 S. … 90 L. …
10 1945-… 1945-01… 1945-010-01… 12/10/1945              1 326 U… 66 S. … 90 L. …
# ℹ 9,289 more rows
# ℹ 45 more variables: lexisCite <chr>, term <dbl>, naturalCourt <dbl>,
#   chief <chr>, docket <chr>, caseName <chr>, dateArgument <chr>,
#   dateRearg <chr>, petitioner <dbl>, petitionerState <dbl>, respondent <dbl>,
#   respondentState <dbl>, jurisdiction <dbl>, adminAction <dbl>,
#   adminActionState <dbl>, threeJudgeFdc <dbl>, caseOrigin <dbl>,
#   caseOriginState <dbl>, caseSource <dbl>, caseSourceState <dbl>, …

Although the filter was removed from the first chunk, the second chunk itself is unchanged, so knitr reuses its cached result and still shows the 9,299 cases from 1945 onward.

Label your chunks

```{r}
#| label: raw-data-cache
#| cache: true
scdb_case <- read_csv("data/scdb-case.csv")
```

```{r}
#| label: processed-data-cache
#| cache: true
#| dependson: raw-data-cache
scdb_clean <- scdb_case |> 
  mutate(one_vote = majVotes - minVotes == 1)
scdb_clean
```

# A tibble: 29,021 × 53
   caseId docketId caseIssuesId dateDecision decisionType usCite sctCite ledCite
   <chr>  <chr>    <chr>        <chr>               <dbl> <chr>  <chr>   <chr>  
 1 1791-… 1791-00… 1791-001-01… 8/3/1791                6 2 U.S… <NA>    1 L. E…
 2 1791-… 1791-00… 1791-002-01… 8/3/1791                2 2 U.S… <NA>    1 L. E…
 3 1792-… 1792-00… 1792-001-01… 2/14/1792               2 2 U.S… <NA>    1 L. E…
 4 1792-… 1792-00… 1792-002-01… 8/7/1792                2 2 U.S… <NA>    1 L. E…
 5 1792-… 1792-00… 1792-003-01… 8/11/1792               8 2 U.S… <NA>    1 L. E…
 6 1792-… 1792-00… 1792-004-01… 8/11/1792               6 2 U.S… <NA>    1 L. E…
 7 1793-… 1793-00… 1793-001-01… 2/19/1793               8 2 U.S… <NA>    1 L. E…
 8 1793-… 1793-00… 1793-002-01… 2/20/1793               2 2 U.S… <NA>    1 L. E…
 9 1793-… 1793-00… 1793-003-01… 2/20/1793               8 2 U.S… <NA>    1 L. E…
10 1794-… 1794-00… 1794-001-01… 2/7/1794               NA 3 U.S… <NA>    1 L. E…
# ℹ 29,011 more rows
# ℹ 45 more variables: lexisCite <chr>, term <dbl>, naturalCourt <dbl>,
#   chief <chr>, docket <chr>, caseName <chr>, dateArgument <chr>,
#   dateRearg <chr>, petitioner <dbl>, petitionerState <dbl>, respondent <dbl>,
#   respondentState <dbl>, jurisdiction <dbl>, adminAction <dbl>,
#   adminActionState <dbl>, threeJudgeFdc <dbl>, caseOrigin <dbl>,
#   caseOriginState <dbl>, caseSource <dbl>, caseSourceState <dbl>, …

Caching guidelines

Label your code chunks
Define dependencies
Never cache chunks that load packages
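
Dependencies between chunks do not cover changes to the data file itself. A minimal sketch of one way to handle that, assuming knitr's `cache.extra` chunk option combined with a `!expr` value so the cache is invalidated when the file's modification time changes. The chunk label is hypothetical, and the chunk assumes the tidyverse was loaded in an earlier, uncached chunk:

```{r}
#| label: raw-data-mtime-cache
#| cache: true
#| cache.extra: !expr file.mtime("data/scdb-case.csv")
# Re-read the file whenever its modification time differs from the cached run
scdb_case <- read_csv("data/scdb-case.csv")
```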

Inline code

We have data about `r nrow(gun_deaths)` individuals killed by guns.

Only `r nrow(gun_deaths) - nrow(youth)` are older than 65. The distribution of the remainder is shown below:

We have data about 100798 individuals killed by guns.

Only 15687 are older than 65.
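
The rendered counts above appear without thousands separators. A minimal sketch of one way to format such values with base R's `format()` before placing them inline. The counts are copied from the rendered output above rather than recomputed from the data, and the variable names are illustrative:

```r
# Counts copied from the rendered slide output (not recomputed from gun_deaths)
n_total <- 100798
n_over_65 <- 15687

# format() with big.mark inserts thousands separators and returns a string
n_total_fmt <- format(n_total, big.mark = ",")
n_over_65_fmt <- format(n_over_65, big.mark = ",")

# Share of victims older than 65, rounded to one decimal place
pct_over_65 <- round(100 * n_over_65 / n_total, 1)
```

In the document itself the same call could go inline, for example `r format(nrow(gun_deaths), big.mark = ",")`.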

🤔 AE: Modify chunk options

Set echo: false for each code chunk
Adjust the figure height and width options for the code chunks with plots
Enable caching for each chunk and render the document. Look at the file structure for the cache. What do you see?

YAML header

---
title: Gun deaths
author: Benjamin Soltoff
date: today
format: html
---

YAML Ain’t Markup Language
Standardized format for storing hierarchical data in a human-readable syntax
Defines how Quarto renders your .qmd file

HTML document

---
title: Gun deaths
author: Benjamin Soltoff
date: today
format: html
---

Table of contents

---
title: Gun deaths
author: Benjamin Soltoff
date: today
format:
  html:
    toc: true
    toc-depth: 2
---

Appearance and style

---
title: Gun deaths
author: Benjamin Soltoff
date: today
format:
  html:
    theme: superhero
    highlight-style: github
---

Global options

---
title: "My Document"
format:
  html:
    fig-width: 7
execute:
  echo: true
  message: false
knitr:
  opts_chunk: 
    comment: "#>" 
---

Default document-level options
Some options are set with format
Some options are set with execute
Some options are set by knitr/jupyter

🤔 AE: Modify YAML options

Add a table of contents
Use themes for light and dark mode
Set relevant code chunk options globally

PDF document

---
title: Gun deaths
author: Benjamin Soltoff
date: today
format: pdf
---

Presentation

---
title: Gun deaths
author: Benjamin Soltoff
date: today
format: revealjs
---

Quarto supports multiple presentation formats

revealjs (HTML)
pptx (PowerPoint)
beamer (\(\LaTeX\)/PDF)

R scripts

# gun-deaths.R
# 2022-04-18
# Examine the distribution of age of victims in gun_deaths

# load packages
library(tidyverse)
library(rcis)

# keep victims aged 65 or younger
youth <- gun_deaths |>
  filter(age <= 65)

# number of individuals older than 65 killed
nrow(gun_deaths) - nrow(youth)

# graph the distribution of youth
ggplot(data = youth, mapping = aes(x = age)) +
  geom_freqpoly(binwidth = 1)

# graph the distribution of youth, by race
youth |>
  mutate(race = fct_infreq(race) |> fct_rev()) |>
  ggplot(mapping = aes(y = race)) +
  geom_bar() +
  labs(y = "Victim race")

When to use a script

For troubleshooting
Initial stages of project
Building a reproducible pipeline
It depends

Running scripts

Interactively
Programmatically using source()
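
A minimal sketch of running a script programmatically with `source()`. The script here is a throwaway file written to a temporary path as a stand-in for a real script such as gun-deaths.R, and the ages and object names are made up for illustration:

```r
# Write a tiny throwaway script to a temporary file (stand-in for a real script)
script_path <- tempfile(fileext = ".R")
writeLines(
  c(
    "ages <- c(12, 30, 47, 70)",
    "n_65_or_younger <- sum(ages <= 65)"
  ),
  script_path
)

# Run the script in its own environment so it does not overwrite existing objects
script_env <- new.env()
source(script_path, local = script_env)

# Objects created by the script are now available in script_env
script_env$n_65_or_younger
```

Calling `source()` without `local` would instead evaluate the script in the global environment, which is closer to running it interactively.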

Recap

Quarto is an open-source, reproducible document system
Compatible with R, Python, Julia, Observable, and more
Supports multiple output formats

It’s Halloween! 🎃