Debugging tips + tools

Lecture 16

Dr. Benjamin Soltoff

Cornell University
INFO 5001 - Fall 2023

2023-10-23

Announcements

  • Continue working on group projects
  • Exam begins Thursday at 8am

Debugging

Grace Hopper standing in front of UNIVAC in 1961.

Bugs

An error, flaw, failure or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways.

A cartoon of a fuzzy round monster face showing 10 different emotions experienced during the process of debugging code. The progression goes from (1) "I got this" - looking determined and optimistic; (2) "Huh. Really thought that was it." - looking a bit baffled; (3) "..." - looking up at the ceiling in thought; (4) "Fine. Restarting." - looking a bit annoyed; (5) "OH WTF." Looking very frazzled and frustrated; (6) "Zombie meltdown." - looking like a full meltdown; (7) (blank) - sleeping; (8) "A NEW HOPE!" - a happy looking monster with a lightbulb above; (9) "insert awesome theme song" - looking determined and typing away; (10) "I love coding" - arms raised in victory with a big smile, with confetti falling.

Approach to debugging

  1. Realize that you have a bug
  2. Make it repeatable
  3. Figure out where it is
  4. Fix it and test it

Conditions in R

  1. Realize that you have a bug
  2. Make it repeatable
  3. Figure out where it is
  4. Fix it and test it

Condition system

  • Signal a condition
  • Handle (or ignore) a condition
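
A minimal base R sketch of both halves of the condition system: signalling conditions with stop(), warning(), and message(), then handling or ignoring them. The function risky_divide() is a hypothetical helper invented for this illustration.

```r
# Hypothetical helper: signals a different condition depending on the input
risky_divide <- function(x, y) {
  if (!is.numeric(x) || !is.numeric(y)) {
    stop("Both inputs must be numeric.")   # error: halts execution
  }
  if (y == 0) {
    warning("Dividing by zero.")           # warning: execution continues
  }
  message("Dividing ", x, " by ", y)       # message: informational only
  x / y
}

# Handle an error: tryCatch() exits the call and returns the handler's value
handled_error <- tryCatch(
  risky_divide(1, "a"),
  error = function(e) conditionMessage(e)
)

# Handle a warning: record it, then resume where the warning was signalled
warnings_seen <- character()
handled_warning <- withCallingHandlers(
  risky_divide(1, 0),
  warning = function(w) {
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Ignore a condition: suppressMessages() silences the message
quiet <- suppressMessages(risky_divide(6, 3))
```

tryCatch() abandons the failing call, while withCallingHandlers() lets the function finish after the handler runs. Which one fits depends on whether the remaining computation is still meaningful.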

Fatal errors

addition <- function(x, y){
  # stop with a fatal error unless both inputs are numeric
  if(!is.numeric(x) || !is.numeric(y)) stop("One of your inputs is not a number.")
  
  x + y
}

addition(3, "abc")
Error in addition(3, "abc"): One of your inputs is not a number.

Warnings

logit <- function(x){
  log(x / (1 - x))
}

logit(-1)
Warning in log(x/(1 - x)): NaNs produced
[1] NaN

Warnings

logit <- function(x){
  if (x < 0 || x > 1) stop('x not between 0 and 1')
  log(x / (1 - x))
}

logit(-1)
Error in logit(-1): x not between 0 and 1

Warnings

logit <- function(x){
  x <- if_else(x < 0 | x > 1, NA_real_, x)
  if (is.na(x)) warning('x not between 0 and 1')
  log(x / (1 - x))
}

logit(-1)
Warning in logit(-1): x not between 0 and 1
[1] NA

Messages

ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  geom_smooth()
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

Suppressing messages

demo_message <- function() message("This is a message")
demo_message()
This is a message
suppressMessages(demo_message())
demo_print <- function() print("This is a message")
demo_print()
[1] "This is a message"
suppressMessages(demo_print())
[1] "This is a message"

What if there are no conditions?

An error, flaw, failure or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways.

  • Bugs can occur without any conditions in R
  • If the output is unintended, it is a bug
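
A small sketch of a bug that raises no condition at all. Both functions are hypothetical and written only for this illustration. The buggy version runs cleanly and returns a number, but operator precedence makes it the wrong number.

```r
# Hypothetical helper with a silent bug: division binds tighter than subtraction
pct_change_buggy <- function(old, new) {
  new - old / old * 100      # evaluated as new - ((old / old) * 100)
}

# Corrected version with explicit parentheses
pct_change_fixed <- function(old, new) {
  (new - old) / old * 100
}

buggy <- pct_change_buggy(50, 75)   # -25, with no error, warning, or message
fixed <- pct_change_fixed(50, 75)   # 50

# Checking against a known answer turns the silent bug into a visible failure
check <- tryCatch(
  {
    stopifnot(pct_change_buggy(50, 75) == 50)
    "passed"
  },
  error = function(e) "failed"
)
```

Because R signals nothing here, the only way to notice the bug is to compare the output against a value you already know to be correct.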

Generate a reprex

  1. Realize that you have a bug
  2. Make it repeatable
  3. Figure out where it is
  4. Fix it and test it

An animated GIF from the film 'Jerry Maguire' where Jerry (played by Tom Cruise) is pleading with an off-screen character saying 'Help me help you.'

A side-by-side comparison of a monster providing problematic code to tech support when it is on a bunch of crumpled, disorganized papers, with both monsters looking sad and very stressed (left), compared to victorious looking monsters celebrating when code is provided in a nice box with a bow labeled 'reprex'. Title text reads 'reprex: make reproducible examples. Help them help everyone!'

Reproducible examples

  • reprex (noun): a reproducible example
  • reprex: the R package
  • reprex::reprex(): the function that generates one

Why reprexes?

Easier to talk about code that:

  • Actually runs
  • I don’t have to run
  • I can easily run

Prior attempts to reprex

Generate a reprex using reprex()

library(tidyverse)
count(diamonds, colour)
Error in `count()`:
! Must group by variables found in `.data`.
✖ Column `colour` is not found.

Reprex do’s and don’ts

  • Ensure the example is fully reproducible
  • Use the smallest, simplest, most built-in data possible
  • Include commands on a strict “need to run” basis
  • Consider including “session info”
  • Use good coding style to ensure the readability of your code by other human beings
  • Ensure portability of the code
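
A sketch of a script that follows these guidelines, using only base R and the built-in mtcars data. The misspelled column name is a made-up problem for illustration. In a real reprex you would leave the failing call bare so the error shows up in the rendered output. tryCatch() is used here only so the sketch runs to the end.

```r
# Built-in data: anyone can run this without extra files
small <- head(mtcars, 3)

# Only the call that misbehaves (the column name is misspelled on purpose)
problem <- tryCatch(
  small[, "mpgg"],
  error = function(e) conditionMessage(e)
)
problem

# Session info records the R version and loaded packages
info <- sessionInfo()
info$R.version$version.string
```

Copying a snippet like this to the clipboard and calling reprex::reprex() would render the code together with its output, ready to paste into a question or issue.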

Use the call stack and/or intuition to find the bug

  1. Realize that you have a bug
  2. Make it repeatable
  3. Figure out where it is
  4. Fix it and test it

The call stack

f <- function(a) g(a)
g <- function(b) h(b)
h <- function(c) i(c)
i <- function(d) {
  if (!is.numeric(d)) {
    stop("`d` must be numeric", call. = FALSE)
  }
  d + 10
}
f("a")
Error: `d` must be numeric
traceback()
5: stop("`d` must be numeric", call. = FALSE) at #3
4: i(c) at #1
3: h(b) at #1
2: g(a) at #1
1: f("a")
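
traceback() is mainly useful in an interactive session. As a hedged sketch, the same information can be captured programmatically with sys.calls() inside a calling handler. The function names below are hypothetical stand-ins for f(), g(), h(), and i().

```r
# Hypothetical chain of functions; the innermost one fails on non-numeric input
outer_fn  <- function(a) middle_fn(a)
middle_fn <- function(b) inner_fn(b)
inner_fn  <- function(c) {
  if (!is.numeric(c)) stop("`c` must be numeric", call. = FALSE)
  c + 10
}

captured <- NULL
msg <- tryCatch(
  withCallingHandlers(
    outer_fn("a"),
    # runs while the failing frames are still on the stack
    error = function(e) captured <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

# Turn each call into text and keep only the frames from our own functions
stack <- vapply(captured, function(cl) deparse(cl)[1], character(1))
user_frames <- stack[grepl("^(outer|middle|inner)_fn\\(", stack)]
user_frames
```

sys.calls() lists the outermost call first, so user_frames reads in the opposite order from the traceback() output, which puts the most recent call at the top.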

Slow down, simplify, and do small things

  • Run code incrementally (especially valuable for piped operations)
  • Strip out extraneous operations
  • Simplify to a minimal reproducible example

Unexpected output

library(tidyverse)

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  labs(x = "Displacement",
       y = "Highway MPG",
       color = "Drive") +
  scale_fill_viridis_d(option = "plasma", end = 0.9) +
  theme_minimal() +
  theme(legend.position = "bottom")

Expected to use the “plasma” color palette from viridis. What went wrong?

Simplify

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point()

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_fill_viridis_d()

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_viridis_d()

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_viridis_d(option = "plasma", end = 0.9)

ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  labs(x = "Displacement",
       y = "Highway MPG",
       color = "Drive") +
  scale_color_viridis_d(option = "plasma", end = 0.9) +
  theme_minimal() +
  theme(legend.position = "bottom")

Fix the bug and test it

  1. Realize that you have a bug
  2. Make it repeatable
  3. Figure out where it is
  4. Fix it and test it

Dealing with bugs

  • Unexpected errors
  • Expected errors
    • Add conditions
    • Use safely()

Dealing with failure using safely()

  • Adverb - modifies a function (verb)
  • Always returns a list with two elements
    1. result
    2. error

Dealing with failure using safely()

safe_sqrt <- safely(sqrt)
str(safe_sqrt(9))
List of 2
 $ result: num 3
 $ error : NULL
str(safe_sqrt("a"))
List of 2
 $ result: NULL
 $ error :List of 2
  ..$ message: chr "non-numeric argument to mathematical function"
  ..$ call   : language .Primitive("sqrt")(x)
  ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
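
To make the "adverb" idea concrete, here is a base R sketch of a function that takes a function and returns a modified version of it. This is not purrr's implementation. safely_sketch() is a hypothetical name, and it omits the otherwise and quiet arguments that safely() provides.

```r
# Hypothetical adverb: wraps f so that it never throws an error
safely_sketch <- function(f) {
  function(...) {
    tryCatch(
      list(result = f(...), error = NULL),               # success
      error = function(e) list(result = NULL, error = e) # failure
    )
  }
}

safe_log <- safely_sketch(log)

ok  <- safe_log(100, base = 10)  # result is filled in, error is NULL
bad <- safe_log("a")             # result is NULL, error holds the condition

str(ok)
conditionMessage(bad$error)
```

Returning the same two-element shape in both cases is the key design choice, since downstream code can always look in the same place for the result and the error.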

safely() and map()

x <- list("a", 4, 5)

# unsafely square root
y <- map(x, sqrt)
Error in `map()`:
ℹ In index: 1.
Caused by error:
! non-numeric argument to mathematical function
# safely square root
y <- map(x, safely(sqrt))
str(y)
List of 3
 $ :List of 2
  ..$ result: NULL
  ..$ error :List of 2
  .. ..$ message: chr "non-numeric argument to mathematical function"
  .. ..$ call   : language .Primitive("sqrt")(x)
  .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
 $ :List of 2
  ..$ result: num 2
  ..$ error : NULL
 $ :List of 2
  ..$ result: num 2.24
  ..$ error : NULL

transpose()

y <- transpose(y)
str(y)
List of 2
 $ result:List of 3
  ..$ : NULL
  ..$ : num 2
  ..$ : num 2.24
 $ error :List of 3
  ..$ :List of 2
  .. ..$ message: chr "non-numeric argument to mathematical function"
  .. ..$ call   : language .Primitive("sqrt")(x)
  .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
  ..$ : NULL
  ..$ : NULL
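
Once results and errors sit in two parallel lists, successes and failures are easy to separate. The sketch below rebuilds the same shape in base R, using tryCatch() and lapply() in place of safely(), map(), and transpose() so that it runs without purrr. The variable names are illustrative assumptions.

```r
x <- list("a", 4, 5)

# One list(result, error) pair per input, mimicking what safely() returns
attempts <- lapply(x, function(el) {
  tryCatch(
    list(result = sqrt(el), error = NULL),
    error = function(e) list(result = NULL, error = e)
  )
})

# Same layout as the transposed list: all results together, all errors together
y <- list(
  result = lapply(attempts, function(a) a$result),
  error  = lapply(attempts, function(a) a$error)
)

# TRUE where no error occurred
ok <- vapply(y$error, is.null, logical(1))

good <- unlist(y$result[ok])  # values that were computed successfully
failed_inputs <- x[!ok]       # inputs that need a closer look
```

Keeping the failed inputs around makes it straightforward to investigate them separately instead of losing the whole batch to a single error.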

Application exercise

ae-14

  • Go to the course GitHub org and find your ae-14 (repo name will be suffixed with your GitHub name).
  • Clone the repo in RStudio Workbench, open the Quarto document in the repo, and follow along and complete the exercises.
  • Render, commit, and push your edits by the AE deadline – end of tomorrow

Recap

  • Debugging is a frustrating, complicated skill that you need to learn
  • Use conditions or careful observations to determine if you have a bug
  • Generate a reprex in order to make the bug repeatable
  • Use the call stack and/or intuition to track down the bug
  • Fix the bug or use condition handling to deal with it

INFO 3312/5312

Course site (from last spring)