Lecture 2
Cornell University
INFO 5001 - Fall 2024
August 29, 2024
Discuss the following for the visualization.
What is the visualization trying to show?
What is effective, i.e. what is done well?
What is ineffective, i.e. what could be improved?
What are you curious about after looking at the visualization?
A measurement of representation of women in film
In order to pass the test, a movie must have (1) at least two women in it, (2) who talk to each other, (3) about something other than a man.
Inspiration: FiveThirtyEight
Go to the repo on GitHub and clone it in RStudio.
Open ae-00-bechdel-viz.qmd, review the document, and fill in the blanks.
Warning: The repo is hosted on GitHub.com because we have not configured your authentication method for Cornell’s GitHub. We will do this tomorrow in lab.
Aesthetics can be mapped to a variable in the data or set to a constant value, e.g. color = binary vs. color = "pink".
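As a minimal sketch of the difference between color = binary and color = "pink", and of the faceting rule stated next, the following uses a small invented data frame. The name films and its columns budget, gross, binary, and decade are hypothetical and do not come from the course data.

```r
library(ggplot2)

# toy data, invented for illustration only (not real films)
films <- data.frame(
  budget = c(10, 25, 40, 60, 80, 120),
  gross  = c(30, 20, 90, 150, 70, 300),
  binary = c("PASS", "FAIL", "PASS", "FAIL", "PASS", "FAIL"),
  decade = c("1990s", "1990s", "2000s", "2000s", "2010s", "2010s")
)

# mapped: color varies with the binary variable, so it goes inside aes()
p_mapped <- ggplot(films, aes(x = budget, y = gross, color = binary)) +
  geom_point()

# set: every point is pink, so color is a constant outside aes()
p_set <- ggplot(films, aes(x = budget, y = gross)) +
  geom_point(color = "pink")

# one faceting variable: facet_wrap()
p_wrap <- p_set + facet_wrap(vars(decade))

# two faceting variables: facet_grid() with rows and columns
p_grid <- p_set + facet_grid(rows = vars(binary), cols = vars(decade))
```

Printing p_mapped should give a legend for binary, while p_set has no legend because nothing in the data drives the color.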
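Use facet_wrap() when faceting (creating small multiples) by one variable and facet_grid() when faceting by two variables.

# packages this code relies on (assumed; the original setup chunk is not shown)
library(tidyverse)   # dplyr, tidyr, tibble, stringr, ggplot2
library(httr2)       # resp_body_json()
library(scales)      # label_percent()
library(colorspace)  # darken(), lighten()

# extract response and convert to tibble
# bechdel_resp is assumed to be an httr2 response created earlier (request not shown)
bechdel_df <- bechdel_resp |>
  resp_body_json() |>
  enframe(name = ".id", value = "result") |>
  # tidy into one row per film
  unnest_wider(col = result) |>
  # convert rating to factor column
  mutate(rating = factor(
    x = rating,
    levels = 0:3,
    labels = c(
      "Fewer than two women",
      "Women don't talk to each other",
      "Women only talk about men",
      "Passes Bechdel test"
    ) |>
      # ensure labels are wrapped for the plot
      str_wrap(width = 18)
  ))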

# summarized values for chart
bechdel_pct <- bechdel_df |>
  count(year, rating) |>
  complete(year, rating, fill = list(n = 0)) |>
  mutate(n_pct = n / sum(n), .by = year) |>
  filter(between(year, 1970, 2023))

# labels for plot
bechdel_labels <- bechdel_pct |>
  filter(year == max(year)) |>
  # calculate midpoint for each category to center on last year
  mutate(midpoint = rev(cumsum(rev(n_pct))) - n_pct + (n_pct / 2))

# note: geom_text(), geom_vline(), scale_fill_manual(), labs(), and theme()
# are inferred from the arguments that follow them
# initiate ggplot object
ggplot(data = bechdel_pct, mapping = aes(x = year, y = n_pct)) +
  # area chart for change in percentages over time
  geom_area(mapping = aes(fill = rating), color = "white") +
  # add text labels directly to the chart
  geom_text(
    data = bechdel_labels,
    mapping = aes(y = midpoint, label = rating, color = rating),
    family = "Atkinson Hyperlegible",
    size = 2.5,
    hjust = 0,
    nudge_x = 1
  ) +
  # add vertical lines to visually separate decades
  geom_vline(
    xintercept = seq(from = 1970, to = 2020, by = 10), color = "white",
    linewidth = 0.25
  ) +
  # percentage labels for y axis
  scale_y_continuous(labels = label_percent()) +
  # manual color palette
  scale_fill_manual(
    values = c(
      darken("#21918c", amount = 0.3),
      "#21918c",
      lighten("#21918c", amount = 0.3),
      "#440154"
    ),
    aesthetics = c("fill", "color"),
    guide = "none"
  ) +
  # label the chart
  labs(
    title = "The Bechdel Test over time",
    subtitle = "How women are represented in movies",
    x = NULL,
    y = NULL,
    caption = "Source: bechdeltest.com; FiveThirtyEight"
  ) +
  # allow chart contents outside of panel
  coord_cartesian(clip = "off") +
  # different base theme and font
  theme_minimal(base_family = "Atkinson Hyperlegible") +
  # customize theme elements
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    axis.ticks.length.y = unit(1, units = "cm"),
    plot.title.position = "plot",
    plot.margin = margin(r = 50, l = 5, t = 5, b = 5)
  )
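
The midpoint formula used for bechdel_labels can be checked by hand on made-up shares. This sketch assumes ggplot2's default stacking order, in which the first factor level ends up at the top of the stacked area. The n_pct values below are invented and are not taken from the Bechdel data.

```r
# toy shares for a single year, ordered like the factor levels
n_pct <- c(0.10, 0.20, 0.30, 0.40)

# top edge of each band: its own share plus every band stacked beneath it
top <- rev(cumsum(rev(n_pct)))

# bottom edge of each band
bottom <- top - n_pct

# vertical center of each band, same expression as in bechdel_labels
midpoint <- rev(cumsum(rev(n_pct))) - n_pct + (n_pct / 2)

# the midpoint sits halfway between the bottom and top edges
check <- (top + bottom) / 2
```

With these shares the first level spans 0.90 to 1.00 and is labeled at 0.95, while the last level spans 0.00 to 0.40 and is labeled at 0.20.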