Predicting children in hotel bookings
Application exercise
Your Turn 1
Run the chunk below and look at the output. Then, copy/paste the code and edit to create a decision tree model for classification that uses the C5.0 engine.
Save it as tree_mod and look at the object. What is different about the output?
Hint: you’ll need https://www.tidymodels.org/find/parsnip/
lr_mod <- logistic_reg() |>
  set_engine(engine = "glm") |>
  set_mode("classification")
lr_mod
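The lr_mod specification follows a three-step parsnip pattern: choose a model type, set an engine, set a mode. As a minimal sketch of that pattern with a different model (an illustrative choice, not the answer to this exercise), the code below specifies a k-nearest neighbors classifier. The name knn_mod and the choice of the kknn engine are assumptions made for the example.

```r
library(parsnip)

# Illustrative only: a k-nearest neighbors specification,
# chosen to show the pattern without giving away the exercise.
knn_mod <- nearest_neighbor() |>   # 1. model type
  set_engine(engine = "kknn") |>   # 2. computational engine
  set_mode("classification")       # 3. prediction mode

# Printing a specification shows the model type, mode, and engine.
# Nothing has been fit yet; no data is involved at this stage.
knn_mod
```

A specification only describes the model. The engine package is typically needed when the model is fit, not when the specification is created.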
Your Turn 2
Fill in the blanks.
Use initial_split(), training(), and testing() to:
- Split hotels into training and test sets. Save the rsplit!
- Extract the training data and fit your classification tree model.
- Check the proportions of the children variable in each set.
Keep set.seed(100) at the start of your code.
Hint: Be sure to remove every _ before running the code!
set.seed(100) # Important!
hotels_split <- ________(hotels, prop = 3 / 4)
hotels_train <- ________(hotels_split)
hotels_test <- ________(hotels_split)
# check distribution
count(x = hotels_train, children) |>
  mutate(prop = n / sum(n))
count(x = hotels_test, children) |>
  mutate(prop = n / sum(n))
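To see how the three rsample functions relate to one another, here is a small sketch on the built-in mtcars data frame rather than hotels. The object names cars_split, cars_train, and cars_test are hypothetical and exist only for this illustration.

```r
library(rsample)

set.seed(100)  # makes the random split reproducible

# Illustrative data: the built-in mtcars data frame (32 rows), not hotels.
cars_split <- initial_split(mtcars, prop = 3 / 4)

# The rsplit object records which rows go where; the data frames
# themselves are extracted with training() and testing().
cars_train <- training(cars_split)
cars_test <- testing(cars_split)

cars_split
nrow(cars_train)
nrow(cars_test)
```

With 32 rows and prop = 3 / 4, the training set receives 24 rows and the test set the remaining 8. Every row lands in exactly one of the two sets.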
Your Turn 3
Run the code below. What does it return?
set.seed(100)
hotels_folds <- vfold_cv(data = hotels_train, v = 10)
hotels_folds
Your Turn 4
Add autoplot() to visualize the ROC curve.
tree_preds <- tree_mod |>
  fit_resamples(
    children ~ average_daily_rate + stays_in_weekend_nights,
    resamples = hotels_folds,
    control = control_resamples(save_pred = TRUE)
  )

tree_preds |>
  collect_predictions() |>
  roc_curve(truth = children, .pred_children) |>
  ________()
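It can help to inspect what roc_curve() produces before plotting it. The sketch below uses two_class_example, a small example data set that ships with yardstick, instead of the hotel predictions. The names roc_df and auc are hypothetical.

```r
library(yardstick)

# Illustrative data: yardstick's bundled two_class_example, not the hotels data.
data("two_class_example", package = "yardstick")

# roc_curve() returns one row per threshold, with the sensitivity
# and specificity obtained at that threshold.
roc_df <- roc_curve(two_class_example, truth = truth, Class1)
head(roc_df)

# roc_auc() summarizes the same curve as a single number:
# the area under it.
auc <- roc_auc(two_class_example, truth = truth, Class1)
auc
```

The curve is a data frame of points, while the AUC is a one-row summary of that curve, so the two answer different questions about the same predictions.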
Acknowledgments
- Materials derived from Tidymodels, Virtually: An Introduction to Machine Learning with Tidymodels by Alison Hill.
- Dataset and some modeling steps derived from A predictive modeling case study and licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA) License.