Tool calling

Lecture 24

Dr. Benjamin Soltoff

Cornell University
INFO 5001 - Fall 2025

November 20, 2025

Announcements

  • Project 01 draft
  • Quiz 03 on Friday

Learning objectives

  • Identify the value of calling external tools from LLMs
  • Define and call tools using {ellmer}
  • Implement tools that interact with Shiny apps
  • Create natural language interfaces to data with {querychat}

Application exercise

ae-22

Instructions

  • Go to the course GitHub org and find your ae-22 (repo name will be suffixed with your GitHub name).
  • Clone the repo in Positron, run renv::restore() to install the required packages, open the Quarto document in the repo, and follow along and complete the exercises.
  • Render, commit, and push your edits by the AE deadline (end of the day).

🔓 Decrypt the .Renviron.secret.Renviron

  1. Run secret.R
  2. The special phrase is:
    info-5001

Recall: How do LLMs work?

  1. You write some words

  2. The LLM writes some more words

  3. You use those words

On their own, can LLMs… access the internet? send an email? interact with the world?

Let’s try it

library(ellmer)

chat <- chat("openai/gpt-4.1-nano")

chat$chat("What's the weather like in Ithaca, NY?")
chat$chat("What day is it?")

Tools

a.k.a. functions, tool calling or function calling

  • Bring real-time or up-to-date information to the model

  • Let the model interact with the world

Chatbot Systems

How do tool calls work?

What should I wear to campus tomorrow?
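
A minimal sketch of the round trip behind that question, written in base R with no real LLM involved. `fake_model()`, `get_forecast()`, and `run_turn()` are hypothetical stand-ins and the forecast values are made up. The point is only the order of events: the model asks for a tool, R runs it, and the result goes back to the model before it answers.

```r
# Tools the model is allowed to request. `get_forecast` is a hypothetical
# stand-in that returns fixed, made-up data instead of calling a weather API.
tools <- list(
  get_forecast = function(city) {
    list(city = city, temp_f = 33L, conditions = "snow showers")
  }
)

# Stand-in for the LLM. It can only write words, so on the first turn it
# replies with a structured tool request. Once a tool result is in the
# conversation, it writes the final answer from that result.
fake_model <- function(messages) {
  last <- messages[[length(messages)]]
  if (identical(last$role, "tool")) {
    forecast <- last$content
    answer <- sprintf(
      "Expect %d F and %s in %s: wear a warm, waterproof coat.",
      forecast$temp_f, forecast$conditions, forecast$city
    )
    return(list(type = "text", text = answer))
  }
  list(
    type = "tool_request",
    name = "get_forecast",
    arguments = list(city = "Ithaca, NY")
  )
}

# The chat loop: keep calling the model, running any tool it requests,
# until it replies with plain text.
run_turn <- function(prompt) {
  messages <- list(list(role = "user", content = prompt))
  repeat {
    reply <- fake_model(messages)
    if (identical(reply$type, "text")) {
      return(reply$text)
    }
    result <- do.call(tools[[reply$name]], reply$arguments)
    messages <- c(messages, list(list(role = "tool", content = result)))
  }
}

answer <- run_turn("What should I wear to campus tomorrow?")
print(answer)
```

In a real chat the model decides for itself whether to request a tool and which arguments to pass; the loop around it stays the same.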

Human in the loop

👨‍💻 _demos/18_manual-tools/app.R

Wait… I can write code!

library(ellmer)

ellmer::create_tool_def(weathR::point_forecast, verbose = TRUE)
tool(
  weathR::point_forecast,
  name = "point_forecast",
  description = "Retrieve point forecast meteorological data for a given location.",
  arguments = list(
    lat = type_number("Latitude of the location."),
    lon = type_number("Longitude of the location."),
    timezone = type_string("The nominal timezone for the forecast. Either an Olson timezone name or '-1' for local time. Defaults to '-1'.", required = FALSE),
    dir_numeric = type_boolean("If TRUE, wind directions are returned as numeric degrees; if FALSE (default), as character values.", required = FALSE)
  )
)

Wait… I can write code!

get_weather <- tool(
  \(lat, lon) weathR::point_forecast(lat, lon),
  name = "point_forecast",
  description = "Get forecast data for a specific latitude and longitude.",
  arguments = list(
    lat = type_number("Latitude of the location."),
    lon = type_number("Longitude of the location.")
  )
)

Wait… I can write code!

get_weather(lat = 42.4397, lon = -76.4953)
Simple feature collection with 156 features and 8 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -76.4953 ymin: 42.4397 xmax: -76.4953 ymax: 42.4397
Geodetic CRS:  WGS 84
First 10 features:
                      time temp   dewpoint humidity p_rain wind_speed wind_dir         skies
1  2025-11-19 10:00:00 EST   39  0.0000000       76      0          1        N         Sunny
2  2025-11-19 11:00:00 EST   42  0.0000000       67      0          2        N         Sunny
3  2025-11-19 12:00:00 EST   43  0.0000000       65      0          2       NW  Mostly Sunny
4  2025-11-19 13:00:00 EST   44  0.0000000       62      0          5       NW  Mostly Sunny
5  2025-11-19 14:00:00 EST   45  0.0000000       60      0          5       NW  Mostly Sunny
6  2025-11-19 15:00:00 EST   43  0.0000000       65      0          5       NW  Mostly Sunny
7  2025-11-19 16:00:00 EST   42  0.0000000       67      0          3       NW         Sunny
8  2025-11-19 17:00:00 EST   40 -0.5555556       70      0          2        N  Mostly Sunny
9  2025-11-19 18:00:00 EST   36 -1.1111111       79      0          2       NE Partly Cloudy
10 2025-11-19 19:00:00 EST   34 -1.1111111       85      0          1       NE Partly Cloudy
                   geometry
1  POINT (-76.4953 42.4397)
2  POINT (-76.4953 42.4397)
3  POINT (-76.4953 42.4397)
4  POINT (-76.4953 42.4397)
5  POINT (-76.4953 42.4397)
6  POINT (-76.4953 42.4397)
7  POINT (-76.4953 42.4397)
8  POINT (-76.4953 42.4397)
9  POINT (-76.4953 42.4397)
10 POINT (-76.4953 42.4397)

Wait… I can write code!

chat <- chat_openai(model = "gpt-4.1-nano", echo = "output")

# Register the tool with the chatbot
chat$register_tool(get_weather)

chat$chat("What should I wear to class tomorrow in Ithaca, NY?")
◯ [tool call] point_forecast(lat = 42.4534, lon = -76.4862)
#> [{"time":"2025-11-10 11:00:00 EST","temp":33,"dewpoint":-1.1111,"humidity":89,"p_rain":57,"wind_speed":…
#> The weather forecast for tomorrow in Ithaca, NY indicates cold temperatures around the low 30s to high 20s 
#> Fahrenheit, with a high chance of snow showers and rain showers throughout the day. It will be windy with wind 
#> speeds around 8-16 mph.
#> 
#> Given these conditions, I recommend wearing warm and waterproof clothing. A good outfit would include:
#> - A warm, insulated waterproof coat
#> - Layers like thermal shirts and sweaters
#> - Waterproof gloves and a hat
#> - Waterproof boots to keep your feet dry and warm
#> - An umbrella might also be useful in case of rain
#> 
#> Make sure to stay warm and dry while attending your class!
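
The same pattern can answer the earlier "What day is it?" question. This is a sketch rather than part of the original demo: `get_current_date` is a hypothetical tool name, and the code assumes the same {ellmer} version used above, in which `tool()` returns a callable function. The model is only contacted when an `OPENAI_API_KEY` is set.

```r
library(ellmer)

# Hypothetical tool: returns today's date as text,
# e.g. "Thursday, November 20, 2025" (exact wording depends on your locale)
get_current_date <- tool(
  \() format(Sys.Date(), "%A, %B %d, %Y"),
  name = "get_current_date",
  description = "Get today's date, including the day of the week.",
  arguments = list()
)

# The tool is an ordinary function, so it can be checked without a model
today <- get_current_date()
print(today)

# Only contact the API when a key is configured
if (nzchar(Sys.getenv("OPENAI_API_KEY"))) {
  chat <- chat_openai(model = "gpt-4.1-nano", echo = "output")
  chat$register_tool(get_current_date)
  chat$chat("What day is it?")
}
```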

Recap: Tool definitions in R

tool_get_weather <- tool(
  tool_fn,
  description = "How and when to use the tool",
  arguments = list(
    .... = type_string(),
    .... = type_integer(),
    .... = type_enum(
      c("choice1", "choice2"),
      required = FALSE
    )
  )
)
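
One way the template might look once filled in. `suggest_outfit()` is a hypothetical function written only for this illustration, and the argument names and descriptions are assumptions, not part of the lecture demos. `type_enum()` is called with named arguments because the position of `values` has differed between {ellmer} releases.

```r
library(ellmer)

# Hypothetical function for illustration: suggests an outfit from simple inputs
suggest_outfit <- function(city, temp_f, precipitation = "none") {
  layer <- if (temp_f < 40) "an insulated coat" else "a light jacket"
  extra <- switch(
    precipitation,
    rain = " and an umbrella",
    snow = " and waterproof boots",
    ""
  )
  paste0("In ", city, ", wear ", layer, extra, ".")
}

tool_suggest_outfit <- tool(
  suggest_outfit,
  name = "suggest_outfit",
  description = "Suggest what to wear given a city, a forecast temperature, and expected precipitation. Use after retrieving a forecast.",
  arguments = list(
    city = type_string("City name, e.g. 'Ithaca, NY'."),
    temp_f = type_integer("Forecast temperature in degrees Fahrenheit."),
    precipitation = type_enum(
      values = c("none", "rain", "snow"),
      description = "Expected precipitation. Defaults to 'none'.",
      required = FALSE
    )
  )
)

# Call the tool directly to confirm it behaves as the description promises
suggestion <- tool_suggest_outfit(
  city = "Ithaca, NY",
  temp_f = 33L,
  precipitation = "snow"
)
print(suggestion)
```

The description and the argument types are all the model sees, so they carry the "how and when to use the tool" guidance; the function body stays invisible to it.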

How {querychat} works

  • Natural language chat powered by LLMs

  • Do not provide the LLM direct access to raw data

  • It can only read or filter data by writing SQL SELECT statements

    • Reliability
    • Transparency
    • Reproducibility
  • Leverages DuckDB for its SQL engine
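
A rough sketch of the idea behind those bullets, using {DBI} and {duckdb} directly. The SQL string stands in for what the LLM might write, the table name `cars` is arbitrary, and the `SELECT` check is an illustrative guard, not querychat's actual implementation.

```r
library(DBI)
library(duckdb)

# In-memory DuckDB database with a built-in data frame registered as a table
con <- dbConnect(duckdb())
duckdb_register(con, "cars", mtcars)

# A query of the kind the LLM might write for
# "How many cars are there for each number of cylinders?"
sql <- "SELECT cyl, COUNT(*) AS n FROM cars GROUP BY cyl ORDER BY cyl"

# Illustrative guard (hypothetical, not querychat's own code):
# refuse anything that is not a SELECT statement
stopifnot(grepl("^\\s*SELECT\\b", sql, ignore.case = TRUE, perl = TRUE))

# R runs the query; the model only ever produced the SQL text
result <- dbGetQuery(con, sql)
print(result)

dbDisconnect(con, shutdown = TRUE)
```

Because the query is plain SQL, it can be shown to the user and re-run later, which is where the transparency and reproducibility come from.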

querychat in R

library(shiny)
library(bslib)
library(querychat)

penguins_qc_config <- querychat_init(penguins)

ui <- page_sidebar(
  sidebar = querychat_sidebar("penguins"),
  # plots, tables, etc.
)

server <- function(input, output, session) {
  penguins_qc <- querychat_server("penguins", penguins_qc_config)

  output$table <- renderTable({
    penguins_qc$df()
  })
}

shinyApp(ui, server)

⌨️ 20_querychat

Instructions

  1. I’ve made a Shiny dashboard to explore Airbnb listings in Asheville, NC.

    • Spend 1-2 min: which neighborhood has the most private rooms?
  2. Work through the steps in the comments to use {querychat}.

  3. Spend a few minutes exploring the data and chatting with the app.
    Which area has the most private rooms?

Enhancing querychat

  • Provide an explicit user greeting
  • Augment the system prompt
    • Data description
    • Custom greeting
  • Use a different LLM provider/model
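
A sketch of the first two enhancements, assuming `querychat_init()` accepts `greeting` and `data_description` arguments as in the querychat README at the time of writing; check `?querychat_init` for your installed version. The greeting text and column notes are written for the built-in `mtcars` data purely as an example. The argument for switching provider or model has changed between releases, so it is left out here.

```r
library(querychat)

# Explicit greeting shown when the chat opens
greeting <- paste(
  "Hello! Ask me to filter or summarize the cars data,",
  "for example: \"Show only cars with 6 cylinders.\""
)

# Plain-language notes about the data, added to the system prompt
data_description <- paste(
  "The data come from the 1974 Motor Trend magazine and cover 32 cars.",
  "- mpg: miles per US gallon",
  "- cyl: number of cylinders",
  "- hp: gross horsepower",
  "- wt: weight in 1000 lbs",
  sep = "\n"
)

# Assumed argument names; verify against your querychat version
cars_qc_config <- querychat_init(
  mtcars,
  greeting = greeting,
  data_description = data_description
)
```

The resulting config is passed to `querychat_server()` exactly as in the app skeleton shown earlier.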

MCP

Recall: Tools

a.k.a. functions, tool calling or function calling

  • Bring real-time or up-to-date information to the model

  • Let the model interact with the world

Who should write tools for GitHub?

MCP solves this problem! GitHub writes tools…

and lets your models use them.

GitHub MCP Server

Implementing MCP in R

{mcptools}

  • R as an MCP client
  • R as an MCP server
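
A hedged sketch of both directions. It assumes `mcptools::mcp_tools()` reads MCP server definitions from a JSON config file (the path below is the assumed default) and returns tools that an {ellmer} chat can use, and that `mcptools::mcp_server()` is what an MCP client launches to talk to R. Consult the {mcptools} documentation before relying on either detail.

```r
library(ellmer)
library(mcptools)

# R as an MCP client ---------------------------------------------------------
# Assumed default location of the mcptools config file that lists the MCP
# servers to start (for example, the GitHub MCP Server); adjust to your setup.
config_path <- path.expand(file.path("~", ".config", "mcptools", "config.json"))

# Only run when both the config file and an API key are available
if (file.exists(config_path) && nzchar(Sys.getenv("OPENAI_API_KEY"))) {
  chat <- chat_openai(model = "gpt-4.1-nano")
  # Hand every tool exposed by the configured MCP servers to the chat
  chat$set_tools(mcp_tools())
  chat$chat("Which tools do you have access to?")
}

# R as an MCP server ---------------------------------------------------------
# An MCP client is configured to launch R with a command along the lines of:
#   Rscript -e "mcptools::mcp_server()"
# mcp_server() is meant to be started by the client, so it is not called here.
```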

Wrap-up

Recap

  • Tool calling lets LLMs access real-time information and interact with the world
  • We can define tools in R and Python using {ellmer} and chatlas
  • Leverage LLMs to create natural language interfaces to data with querychat
  • The Model Context Protocol (MCP) standardizes how LLMs can access tools

Acknowledgments