I have started learning web scraping with R. My first project is to collect a list of all the cookbooks on Indigo and do some analysis.
So far, though, I can only select the first book on the page. I am using the "rvest" package and the SelectorGadget extension for Google Chrome. I have watched YouTube videos and followed various links, but no one seems to have this issue. I would be happy to get any ideas on listing all the books on the page, and across all available pages.
Code:
library(rvest)
library(tidyverse)
indigo_page <- read_html("https://www.chapters.indigo.ca/en-ca/books/top-tens/cookbooks/")
indigo_page %>%
  html_node(".product-list__product-title") %>%
  html_text()
Output:
[1] "The Comfortable Kitchen: 105 Laid-back, Healthy, And Wholesome Recipes"
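Donjazz, my first suggestion would be to use html_nodes() rather than html_node(). html_node() returns only the first element that matches the selector, which is why you get a single title, while html_nodes() returns every match. With that one change, the same pipeline seems to output all of the titles on the page. (In rvest 1.0.0 and later these functions are also available as html_element() and html_elements(); the older names still work.)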
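To see the difference without hitting the live site, here is a minimal sketch that runs both functions on a small hand-written HTML snippet. The snippet and the book titles are made up for illustration; only the class name .product-list__product-title comes from the question, and the real page's markup may well differ.

```r
library(rvest)

# Hypothetical markup standing in for the listing page (not Indigo's real HTML).
page <- read_html('<ul>
  <li><a class="product-list__product-title">Book A</a></li>
  <li><a class="product-list__product-title">Book B</a></li>
  <li><a class="product-list__product-title">Book C</a></li>
</ul>')

# html_node() stops at the first element matching the selector.
first_title <- page %>%
  html_node(".product-list__product-title") %>%
  html_text()

# html_nodes() collects every matching element.
all_titles <- page %>%
  html_nodes(".product-list__product-title") %>%
  html_text()

print(first_title)  # "Book A"
print(all_titles)   # "Book A" "Book B" "Book C"
```

On the real page, the same html_nodes() call should give a character vector with one entry per book, which you can then drop into a tibble for the analysis.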