AE 10: Scraping multiple pages of articles from the Cornell Review
Application exercise
Packages
We will use the following packages in this application exercise.
- tidyverse: For data import, wrangling, and visualization.
- rvest: For scraping HTML files.
- robotstxt: For verifying whether we are allowed to scrape a website.
Data scraping
Do the scraping in the iterate-cornell-review.R R script, then save the resulting data frame in the data folder.
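The general shape of scraping several pages and combining the results might look like the sketch below. It is only an illustration under stated assumptions: the function name scrape_page(), the CSS selectors, the stand-in HTML, and the output file name cornell-review-all.csv are all hypothetical and not taken from the exercise. In the real script each page would come from read_html() on an actual URL, after checking permission with robotstxt, and the selectors would need to match the site's markup.

```r
library(tidyverse)
library(rvest)

# Hypothetical helper: extract one row per article from a parsed page.
# The selectors ("article", "h2", "a") are assumptions for illustration.
scrape_page <- function(page) {
  articles <- html_elements(page, "article")

  tibble(
    title = articles |> html_element("h2") |> html_text2(),
    url = articles |> html_element("a") |> html_attr("href")
  )
}

# Stand-in pages so the sketch runs offline. In the exercise, each
# element would instead be the result of read_html() on a page URL.
pages <- list(
  minimal_html(
    '<article><h2><a href="/a1">First article</a></h2></article>
     <article><h2><a href="/a2">Second article</a></h2></article>'
  ),
  minimal_html(
    '<article><h2><a href="/a3">Third article</a></h2></article>'
  )
)

# Apply the helper to every page, then stack the results into one data frame.
articles <- map(pages, scrape_page) |>
  list_rbind()

# Save the combined data frame in the data folder (file name is hypothetical).
dir.create("data", showWarnings = FALSE)
write_csv(articles, file = "data/cornell-review-all.csv")
```

Writing the per-page logic as a single function and mapping it over the pages keeps the iteration step short, and it makes it easy to test the function on one page before running it on all of them.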