Debugging tips + tools

Lecture 15

Dr. Benjamin Soltoff

Cornell University
INFO 5001 - Fall 2024

October 22, 2024

Announcements


  • Homework 04
  • Exam 01
  • Project exploration

Debugging

Grace Hopper standing in front of UNIVAC in 1961.

Bugs

An error, flaw, failure or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways.

A cartoon of a fuzzy round monster face showing 10 different emotions experienced during the process of debugging code. The progression goes from (1) "I got this" - looking determined and optimistic; (2) "Huh. Really thought that was it." - looking a bit baffled; (3) "..." - looking up at the ceiling in thought; (4) "Fine. Restarting." - looking a bit annoyed; (5) "OH WTF." - looking very frazzled and frustrated; (6) "Zombie meltdown." - looking like a full meltdown; (7) (blank) - sleeping; (8) "A NEW HOPE!" - a happy looking monster with a lightbulb above; (9) "insert awesome theme song" - looking determined and typing away; (10) "I love coding" - arms raised in victory with a big smile, with confetti falling.

Approach to debugging

  1. Realize that you have a bug
  2. Make it repeatable
  3. Figure out where it is
  4. Fix it and test it

Realize that you have a bug

Condition system

  • Signal a condition
  • Handle (or ignore) a condition
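
The two halves of the condition system can be sketched in base R. This is a minimal illustration, not part of the original slides: check_positive() and its messages are hypothetical, while stop(), warning(), message(), tryCatch(), and withCallingHandlers() are base R functions.

```r
# Hypothetical helper that signals each of the three condition types
check_positive <- function(x) {
  if (!is.numeric(x)) stop("`x` must be numeric")  # error: halts execution
  if (x < 0) warning("`x` is negative")            # warning: execution continues
  message("Checked a value")                       # message: informational only
  x
}

# Handle an error: tryCatch() exits the call and returns the handler's value
handled_error <- tryCatch(
  check_positive("a"),
  error = function(e) conditionMessage(e)
)

# Handle a warning without stopping: record it, then muffle it
seen <- character()
value <- withCallingHandlers(
  check_positive(-1),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  },
  message = function(m) invokeRestart("muffleMessage")
)

handled_error
value
seen
```

tryCatch() suits errors you want to recover from, because it abandons the failing call. withCallingHandlers() suits warnings and messages, because the function keeps running after the handler returns.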

Fatal errors

addition <- function(x, y){
  if(!is.numeric(x) || !is.numeric(y)) stop("One of your inputs is not a number.")
  
  x + y
}

addition(3, "abc")
Error in addition(3, "abc"): One of your inputs is not a number.

Warnings

logit <- function(x){
  log(x / (1 - x))
}

logit(-1)
Warning in log(x/(1 - x)): NaNs produced
[1] NaN

Warnings

logit <- function(x){
 if (x < 0 || x > 1) stop('x not between 0 and 1')
 log(x / (1 - x))
}

logit(-1)
Error in logit(-1): x not between 0 and 1

Warnings

logit <- function(x){
  x <- if_else(x < 0 | x > 1, NA_real_, x)
  if (is.na(x)) warning('x not between 0 and 1')
  log(x / (1 - x))
}

logit(-1)
Warning in logit(-1): x not between 0 and 1
[1] NA

Messages

ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  geom_smooth()
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

Suppressing messages

```{r}
#| label: suppress-message
#| message: false
demo_message <- function() message("This is a message")
demo_message()
```
```{r}
#| label: suppress-print
#| message: false
demo_print <- function() print("This is a message")
demo_print()
```
[1] "This is a message"
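
The chunk option silences demo_message() but not demo_print() because message() signals a condition, while print() only writes to standard output. The sketch below checks that difference directly. The helper signals_message() is hypothetical and written only for this illustration; suppressMessages() is base R and has a similar effect to `message: false` when running code outside a Quarto chunk.

```r
demo_message <- function() message("This is a message")
demo_print <- function() print("This is a message")

# Hypothetical helper: TRUE if calling `f` signals a message condition
signals_message <- function(f) {
  tryCatch(
    {
      f()
      FALSE
    },
    message = function(m) TRUE
  )
}

signals_message(demo_message)  # message() signals a condition
signals_message(demo_print)    # print() writes output and signals nothing

# suppressMessages() muffles message conditions only
suppressMessages(demo_message())
suppressMessages(demo_print())  # the print() output still appears
```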

What if there are no conditions?

An error, flaw, failure or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways.

  • Bugs can occur without R signaling any condition
  • If the output is unintended, it is a bug
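
A small illustration of a bug that produces no error, warning, or message. The vectors and the intent ("keep every value equal to 1 or 2") are made up for this sketch. Because the length of `x` is a multiple of the length of the comparison vector, R recycles the shorter vector silently.

```r
x <- c(1, 1, 2, 2, 3, 3)

# Buggy: `==` compares element by element against 1, 2, 1, 2, 1, 2
buggy <- x[x == c(1, 2)]

# Intended: test membership in the set {1, 2}
fixed <- x[x %in% c(1, 2)]

length(buggy)  # fewer values than intended, with no condition signaled
length(fixed)
```

Only a comparison of the output against what you expected reveals the problem, which is why careful observation matters when no condition appears.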

Make it repeatable

An animated GIF from the film 'Jerry Maguire' where Jerry (played by Tom Cruise) is pleading with an off-screen character saying 'Help me help you.'

A side-by-side comparison of a monster providing problematic code to tech support when it is on a bunch of crumpled, disorganized papers, with both monsters looking sad and very stressed (left), compared to victorious looking monsters celebrating when code is provided in a nice box with a bow labeled 'reprex'. Title text reads 'reprex: make reproducible examples. Help them help everyone!'

Reproducible examples

  • reprex (noun): a reproducible example
  • reprex: the R package for creating them
  • reprex::reprex(): the function that renders one

Why reprexes?

It is easier to talk about code that:

  • Actually runs
  • I don’t have to run
  • I can easily run if I want to

Prior attempts to reprex

Generate a reprex using reprex()

library(tidyverse)
count(diamonds, colour)
Error in `count()`:
! Must group by variables found in `.data`.
✖ Column `colour` is not found.

Reprex do’s and don’ts

  • Ensure the example is fully reproducible
  • Use the smallest, simplest, most built-in data possible
  • Consider including session_info = TRUE in the call to reprex() so readers can see your R and package versions
  • Use good coding style to ensure the readability of your code by other human beings
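
One way to apply these guidelines, sketched under a few assumptions: the reprex package is installed, the code runs in an interactive session, and the example code is passed through the `input` argument as a character vector. The variable name reprex_lines is hypothetical.

```r
# A self-contained example: it loads its own packages and uses built-in data
reprex_lines <- c(
  "library(tidyverse)",
  "count(diamonds, colour)"
)

# Render only in an interactive session where reprex is available
if (interactive() && requireNamespace("reprex", quietly = TRUE)) {
  reprex::reprex(input = reprex_lines, session_info = TRUE)
}
```

The error from count() is part of the rendered reprex, so a reader can see exactly what went wrong without running anything.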

Figure out where it is

The call stack

f <- function(a) g(a)
g <- function(b) h(b)
h <- function(c) i(c)
i <- function(d) {
  if (!is.numeric(d)) {
    stop("`d` must be numeric", call. = FALSE)
  }
  d + 10
}
f("a")
Error: `d` must be numeric
traceback()
5: stop("`d` must be numeric", call. = FALSE) at #3
4: i(c) at #1
3: h(b) at #1
2: g(a) at #1
1: f("a")
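
traceback() is meant for interactive use right after an error. In a script or rendered document, one alternative is to record the call stack with a calling handler. This is a sketch rather than the lecture's method; the names recorded_calls, msg, and call_text are hypothetical.

```r
f <- function(a) g(a)
g <- function(b) h(b)
h <- function(c) i(c)
i <- function(d) {
  if (!is.numeric(d)) {
    stop("`d` must be numeric", call. = FALSE)
  }
  d + 10
}

recorded_calls <- NULL
msg <- tryCatch(
  withCallingHandlers(
    f("a"),
    # The calling handler runs before the stack unwinds, so sys.calls()
    # still contains the calls to f(), g(), h(), and i()
    error = function(e) recorded_calls <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

# First line of text for each call on the stack, outermost first
call_text <- vapply(recorded_calls, function(cl) deparse(cl)[1], character(1))
call_text
```

The recorded stack also includes the tryCatch() and withCallingHandlers() frames, so look for the first call you wrote yourself and read downward from there.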

Slow down, simplify, and do small things

  • Run code incrementally (especially valuable for piped operations)
  • Strip out extraneous operations
  • Simplify to a minimal reproducible example
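
Running a piped operation incrementally might look like the sketch below. It uses the base R pipe (R 4.1 or later) and the built-in mtcars data so that it runs without extra packages; the same idea applies to tidyverse pipelines. The step names and the high_mpg column are hypothetical.

```r
# The full pipeline, run all at once
full <- mtcars |>
  subset(cyl == 4, select = c(mpg, gear)) |>
  transform(high_mpg = mpg > 30) |>
  subset(high_mpg)

# The same pipeline, one step at a time, inspecting each intermediate result
step1 <- mtcars |> subset(cyl == 4, select = c(mpg, gear))
nrow(step1)          # did the filter keep the rows you expected?

step2 <- step1 |> transform(high_mpg = mpg > 30)
table(step2$high_mpg) # is the new column what you intended?

step3 <- step2 |> subset(high_mpg)
identical(step3, full)
```

If an intermediate result looks wrong, the bug is in that step or an earlier one, which narrows the search considerably.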

Unexpected output

Expected to use the “plasma” color palette from viridis. What went wrong?

library(tidyverse)

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  labs(x = "Displacement",
       y = "Highway MPG",
       color = "Drive") +
  scale_fill_viridis_d(option = "plasma", end = 0.9) +
  theme_minimal() +
  theme(legend.position = "bottom")

Strip it down to a basic plot

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point()

Add simplified broken part

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_fill_viridis_d()

Figure out broken part

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_viridis_d()

Add some parts back in

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_viridis_d(option = "plasma", end = 0.9)

Add the rest back in

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  labs(x = "Displacement",
       y = "Highway MPG",
       color = "Drive") +
  scale_color_viridis_d(option = "plasma", end = 0.9) +
  theme_minimal() +
  theme(legend.position = "bottom")

Fix it and test it

Dealing with bugs

  • Unexpected errors
  • Expected errors
    • Add conditions
    • Use safely()

Dealing with failure using safely()

  • Adverb - modifies a function (verb)
  • Always returns a list with two elements
    1. result
    2. error

Dealing with failure using safely()

safe_sqrt <- safely(sqrt)
str(safe_sqrt(9))
List of 2
 $ result: num 3
 $ error : NULL
str(safe_sqrt("a"))
List of 2
 $ result: NULL
 $ error :List of 2
  ..$ message: chr "non-numeric argument to mathematical function"
  ..$ call   : language .Primitive("sqrt")(x)
  ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"

safely() and map()

x <- list("a", 4, 5)

# unsafely square root
y <- map(x, sqrt)
Error in `map()`:
ℹ In index: 1.
Caused by error:
! non-numeric argument to mathematical function
# safely square root
y <- map(x, safely(sqrt))
str(y)
List of 3
 $ :List of 2
  ..$ result: NULL
  ..$ error :List of 2
  .. ..$ message: chr "non-numeric argument to mathematical function"
  .. ..$ call   : language .Primitive("sqrt")(x)
  .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
 $ :List of 2
  ..$ result: num 2
  ..$ error : NULL
 $ :List of 2
  ..$ result: num 2.24
  ..$ error : NULL

transpose()

y <- transpose(y)
str(y)
List of 2
 $ result:List of 3
  ..$ : NULL
  ..$ : num 2
  ..$ : num 2.24
 $ error :List of 3
  ..$ :List of 2
  .. ..$ message: chr "non-numeric argument to mathematical function"
  .. ..$ call   : language .Primitive("sqrt")(x)
  .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
  ..$ : NULL
  ..$ : NULL
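
Once the output is transposed, the successes and failures can be separated. The sketch below assumes the purrr package is installed; the names ok, failed_inputs, and results are hypothetical.

```r
library(purrr)

x <- list("a", 4, 5)
y <- transpose(map(x, safely(sqrt)))

# An element succeeded if its error is NULL
ok <- map_lgl(y$error, is.null)

# Inputs that failed, for follow-up debugging
failed_inputs <- x[!ok]

# Results from the inputs that worked, as a numeric vector
results <- unlist(y$result[ok])

ok
failed_inputs
results
```

Keeping the failed inputs alongside the successful results lets the rest of the analysis continue while you investigate what went wrong.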

Application exercise

ae-13

  • Go to the course GitHub org and find your ae-13 (repo name will be suffixed with your GitHub name).
  • Clone the repo in RStudio, run renv::restore() to install the required packages, open the Quarto document in the repo, and follow along and complete the exercises.
  • Render, commit, and push your edits by the AE deadline – end of the day

Recap

  • Debugging is a frustrating, complicated skill that you need to learn
  • Use conditions or careful observations to determine if you have a bug
  • Generate a reprex in order to make the bug repeatable
  • Use the call stack and/or intuition to track down the bug
  • Fix the bug or use condition handling to deal with it