
Unable to scrape data from this site (Using R)


I can't work out which CSS selectors to use with RSelenium so that any data is returned. The site is: https://www.rbcroyalbank.com/investments/gic-rates.html

The data I need are the Non-Redeemable GIC rates with interest paid annually (the second column), for terms of 1, 2, 3, 4, 5, 7, and 10 years.

Some Failed Efforts

library("RSelenium")
library("rvest")
library("httr")
library("tidyverse")

remDr$navigate("https://www.rbcroyalbank.com/investments/gic-rates.html")
webElem <- remDr$findElement(using = "css selector", value = "tr:nth-child(7) .text-center:nth-child(2) div")


# OR

pg <- remDr$getPageSource()[[1]]
df <- tibble(Rates = pg %>% 
               read_html() %>% 
               html_nodes(xpath = '//tr[(((count(preceding-sibling::*) + 1) = 6) and parent::*)]//*[contains(concat( " ", @class, " " ), concat( " ", "text-center", " " )) and (((count(preceding-sibling::*) + 1) = 2) and parent::*)]//div') %>% 
               html_text())

Solution

  • Below is a possible solution.

    # Library used to scrape the information: RSelenium, version 1.7.7 (mandatory)
    library(RSelenium)
    driver <- rsDriver(browser = "firefox", port = 4567L)

    # Define the client side of the driver.
    remote_driver <- driver[["client"]]
    remote_driver$navigate("https://www.rbcroyalbank.com/investments/gic-rates.html")

    # Click the element with id "gic-nrg" (presumably the Non-Redeemable GIC tab).
    remote_driver$findElement(using = "css selector", value = "#gic-nrg")$clickElement()

    # Locate the rate table and read its text, one whitespace-separated token per row.
    x <- remote_driver$findElement(using = "css selector", value = "#guaranteed-return-1 > div:nth-child(1) > table:nth-child(1)")
    df <- read.table(text = gsub(' ', '\n', x$getElementText()[[1]]), header = TRUE)

    # Drop the first 46 rows, keeping only the remaining values.
    df[-(1:46), ]
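
    The hard-coded row offset above breaks if the page layout shifts. As an alternative, here is a minimal sketch that parses the table with rvest and keeps rows by term. It assumes the rendered table sits under `#guaranteed-return-1` and that the term is in the first column and the annual rate in the second. The `page_source` string is an invented stand-in for `remote_driver$getPageSource()[[1]]`; its markup, column names, and rates are placeholders, not values from the real site.

```r
library(rvest)

# Stand-in for remote_driver$getPageSource()[[1]]. The markup, column names,
# and rates here are invented placeholders, not values from the real page.
page_source <- '
<div id="guaranteed-return-1">
  <table>
    <tr><th>Term</th><th>Annually</th><th>Monthly</th></tr>
    <tr><td>1 Year</td><td>1.00%</td><td>0.90%</td></tr>
    <tr><td>2 Years</td><td>1.10%</td><td>1.00%</td></tr>
    <tr><td>6 Years</td><td>1.50%</td><td>1.40%</td></tr>
  </table>
</div>'

# Parse the HTML and convert the first table under the container to a tibble.
rates <- read_html(page_source) %>%
  html_element("#guaranteed-return-1 table") %>%
  html_table()

# Keep only the wanted terms, identified by the first number in column 1,
# and only the first two columns (term and annual rate).
wanted_terms <- c(1, 2, 3, 4, 5, 7, 10)
term_years <- as.numeric(sub("^[^0-9]*([0-9]+).*$", "\\1", rates[[1]]))
result <- rates[term_years %in% wanted_terms, 1:2]
print(result)
```

    With a live session, replace `page_source` with `remote_driver$getPageSource()[[1]]` after clicking the tab. The selector and column positions may need adjusting to match the real markup.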