Tags: javascript, web-scraping, rselenium

Web scraping with RSelenium: findElement returning nothing


I am new to web scraping and have been trying to collect information using RSelenium (as an alternative to rvest), since some of the websites I am interested in use JavaScript. However, when I run the code below, findElement() returns nothing.

library(RSelenium)

driver <- rsDriver(browser=c("chrome"), chromever="81.0.4044.138")

remote_driver <- driver$client

remote_driver$navigate("https://www.gucci.com/uk/en_gb/ca/decor-c-decor")

p <- remote_driver$findElement(using = "xpath", "//span[@class = 'sale']")
product <- p$getElementText()
product

The XPath appears to be correct; any ideas?
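One possible cause worth ruling out: on a JavaScript-heavy page, the element may simply not exist in the DOM yet when `findElement()` runs. A minimal sketch of checking for this, assuming the same `remote_driver` session as above (the 5-second wait is an arbitrary illustration, not a tuned value):

```r
# Give the page's JavaScript time to render before querying the DOM.
Sys.sleep(5)

# findElements() (plural) returns an empty list rather than erroring when
# nothing matches, which makes it easy to test whether the node exists yet.
elems <- remote_driver$findElements(using = "xpath", "//span[@class = 'sale']")
length(elems)  # 0 means the element still isn't in the DOM
```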


Solution

  • I'm not sure if this is the best way to do it, but you can use RSelenium to get the rendered page source (including the JavaScript-generated elements) and then use rvest to extract what you need from it.

    library(dplyr)
    library(rvest)
    
    # Grab the fully rendered page source from the Selenium session,
    # then parse it once with rvest.
    page_source <- remote_driver$getPageSource()[[1]]
    page <- read_html(page_source)
    
    df <- tibble(
      Products = page %>%
        html_nodes(xpath = "//div[@class = 'product-tiles-grid-item-info']/h2") %>%
        html_text(),
      Prices = page %>%
        html_nodes(xpath = "//span[@class = 'sale']") %>%
        html_text()
    )
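
Once you're done scraping, it's good practice to shut the session down cleanly; a sketch, assuming the `driver`/`remote_driver` objects from the question:

```r
# Close the browser window and stop the local Selenium server,
# so no orphaned chromedriver processes are left running.
remote_driver$close()
driver$server$stop()
```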