I am new to web scraping and have been attempting to collect information with RSelenium (as an alternative to rvest), since some of the websites I am interested in use JavaScript. However, when I run the code below, findElement() returns nothing.
library(RSelenium)
driver <- rsDriver(browser=c("chrome"), chromever="81.0.4044.138")
remote_driver <- driver$client
remote_driver$navigate("https://www.gucci.com/uk/en_gb/ca/decor-c-decor")
p <- remote_driver$findElement(using = "xpath", "//span[@class = 'sale']")
product <- p$getElementText()
product
The XPath appears to be correct. Any ideas?
I'm not sure if this is the best way to do it, but you can use RSelenium to get the rendered page source (including the JavaScript-generated elements) and then use rvest to extract what you need.
library(dplyr)
library(rvest)

# Pull the fully rendered page source out of the Selenium session
# and parse it once with rvest
elemrvest <- remote_driver$getPageSource()[[1]]
page <- read_html(elemrvest)

df <- tibble(
  Products = page %>%
    html_nodes(xpath = "//div[@class = 'product-tiles-grid-item-info']/h2") %>%
    html_text(),
  Prices = page %>%
    html_nodes(xpath = "//span[@class = 'sale']") %>%
    html_text()
)
df
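As for why the original findElement() call comes back empty: on JavaScript-heavy pages the element may simply not exist yet at the moment you query it. A minimal sketch of that idea, assuming the page only needs a moment to render (the 5-second pause is an arbitrary guess, not a tuned value):

# Give the JavaScript a few seconds to render the product tiles
# before querying the DOM (5 seconds is an arbitrary choice)
Sys.sleep(5)

# findElements() (plural) returns a list of all matches, so an empty
# list just means nothing was found rather than throwing an error
price_elems <- remote_driver$findElements(using = "xpath", "//span[@class = 'sale']")
prices <- unlist(lapply(price_elems, function(el) el$getElementText()))
prices

When you are finished, remote_driver$close() and driver$server$stop() will shut down the browser and the Selenium server.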