I am trying to scrape the weather table on this website (https://www.timeanddate.com/weather/canada/vancouver/historic?month=10&year=2017) for every day in October. I managed to scrape the table for the first day of October with the following code:
library("rvest")
content<-read_html("https://www.timeanddate.com/weather/canada/vancouver/historic?month=10&year=2017")
tables <- content %>% html_table(fill = TRUE)
tables[[2]]
The table for October 2, 3, ... is generated by picking a different value in a drop-down menu. I can get those values with:
content %>%
html_nodes("#wt-his-select option")%>% html_attrs()
From similar questions, I understand that I need to use httr::POST or submit a form, but from here I have no idea how to get the tables for October 2, 3, 4, ....
I also tried the following, but the drop-down menu I want to select options from does not seem to be a form, as it does not show up in the output:
html_form(content)
Furthermore, I cannot use "RSelenium" because I get an error (can't execute rsDriver (connection refused)). To resolve that I would need to install Docker, which I cannot do for now due to Windows problems. Any help would be greatly appreciated!
If you watch the Network tab in your browser's developer tools while changing the drop-down, you can see that the page sends a request to a URL like this: https://www.timeanddate.com/scripts/cityajax.php?n=canada/vancouver&mode=historic&hd=20171011&month=10&year=2017&json=1
You can use jsonlite to extract the data from it.
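As a rough sketch of how this could be extended to the whole month: the `hd` parameter in the URL above looks like the date in `YYYYMMDD` form, so one URL per day can be built and passed to `jsonlite::fromJSON()`. The helper names `build_day_url` and `fetch_day` are made up for this example, and I am assuming the endpoint returns valid JSON for every day; if the response turns out not to be strict JSON, it would need cleaning up before parsing.

```r
# Build the request URL(s) for one or more days, following the pattern
# seen in the Network tab. `place` defaults to the city from the question.
build_day_url <- function(day, place = "canada/vancouver") {
  day <- as.Date(day)
  sprintf(
    "https://www.timeanddate.com/scripts/cityajax.php?n=%s&mode=historic&hd=%s&month=%d&year=%s&json=1",
    place,
    format(day, "%Y%m%d"),          # hd = date as YYYYMMDD (assumed meaning)
    as.integer(format(day, "%m")),  # month without zero padding, as in the URL above
    format(day, "%Y")
  )
}

# Download and parse the response for a single URL.
# Assumption: the endpoint returns JSON that jsonlite can parse directly.
fetch_day <- function(url) {
  jsonlite::fromJSON(url)
}

# All days in October 2017 and their URLs.
days <- seq(as.Date("2017-10-01"), as.Date("2017-10-31"), by = "day")
urls <- build_day_url(days)

# Not run here because it needs network access:
# results <- lapply(urls, fetch_day)
# names(results) <- format(days)
```

Each element of `results` would then hold the parsed data for one day. It is probably worth pausing between requests (for example with `Sys.sleep(1)`) to avoid hammering the site.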