
Scrape a site with tabs using RSelenium where Inspector Gadget fails


This site has a table (Guaranteed Investment Certificate - Long-Term and Compound Interest) that appears after clicking a tab (Non-cashable GICs).

My plan is to find the tab by id, click on it, and then grab the HTML source. From there, I usually use read_html and html_nodes to narrow down to the items I'm looking for. In this case, those are the rates for Non-registered and Registered (TFSA, RSP, RIF, RESP).

However, Inspector Gadget with Chrome freezes on the site, so I am unsure which CSS selector to use. Any ideas on how to get the rates for the Guaranteed Investment Certificate - Long-Term and Compound Interest table?

# TD GIC scrape - FAIL
remDr$navigate("https://www.td.com/ca/en/personal-banking/products/saving-investing/gic-rates-canada/")

# Find element, click element and then get source
webElem <- remDr$findElement(using = "id", "Tab_non-cashable")
webElem$clickElement()
html <- remDr$getPageSource()[[1]]

read_html(html) %>% # parse HTML
  html_nodes("td-complex-chart") 

# {xml_nodeset (0)}

Solution

  • My solution is below:

    library(RSelenium)
    driver <- rsDriver(browser=c("firefox"), port = 4567L)
    remote_driver <- driver[["client"]]
    remote_driver$navigate("https://www.td.com/ca/en/personal-banking/products/saving-investing/gic-rates-canada/")
    webElem <- remote_driver$findElement(using = "xpath", '//*[@id="Tab_non-cashable"]')
    webElem$clickElement()
    

    After that, you can extract your table.

    Here is one way to get the Guaranteed Investment Certificate - Long-Term and Simple Interest table:

    webElem <- remote_driver$findElement(using = "css selector", 'section.ng-scope:nth-child(7) > div:nth-child(1)')
    webElem$getElementText()
    [[1]]
    [1] "Guaranteed Investment Certificate - Long-Term and Simple Interest\nTerm\nNon-registered and Registered (TFSA, RSP, RIF, RESP)\n1 year\n0.45%\n2 years\n0.50%\n3 years\n0.60%\n4 years\n0.70%\n5 years\n0.85%"
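
The text returned above is one newline-separated string, so it still needs to be reshaped into rows. A minimal sketch in base R follows, assuming the layout seen in that output (a title line, two header lines, then alternating term and rate lines). The names txt, cells and rates are illustrative only, and the layout assumption may not hold for other tables on the page.

```r
# Text as returned by webElem$getElementText()[[1]] in the output above
txt <- "Guaranteed Investment Certificate - Long-Term and Simple Interest\nTerm\nNon-registered and Registered (TFSA, RSP, RIF, RESP)\n1 year\n0.45%\n2 years\n0.50%\n3 years\n0.60%\n4 years\n0.70%\n5 years\n0.85%"

# Split the single string into one element per line
parts <- strsplit(txt, "\n", fixed = TRUE)[[1]]

# Assumption: line 1 is the title and lines 2-3 are the column headers;
# everything after that alternates term, rate, term, rate, ...
cells <- parts[-(1:3)]

rates <- data.frame(
  term = cells[c(TRUE, FALSE)],                       # odd positions: terms
  rate_pct = as.numeric(sub("%", "", cells[c(FALSE, TRUE)], fixed = TRUE)),  # even positions: rates
  stringsAsFactors = FALSE
)

print(rates)
```

Converting the rate to a number here is a design choice; keeping the raw "0.45%" strings would work just as well if the values are only being displayed.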
    

    Where possible, I suggest using XPath to pinpoint the part that you need.
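
As a rough illustration of the XPath suggestion, the sketch below anchors on a table's visible heading text instead of its position on the page. The HTML here is hypothetical stand-in markup, not the real TD page source, whose structure is not shown in this thread; only the two rates in it are taken from the output above. It assumes the rvest package is installed.

```r
library(rvest)

# Hypothetical markup standing in for remote_driver$getPageSource()[[1]];
# the real page will be structured differently.
html <- '
<section>
  <h3>Some other table</h3>
  <table>
    <tr><th>Term</th><th>Rate</th></tr>
    <tr><td>n/a</td><td>n/a</td></tr>
  </table>
</section>
<section>
  <h3>Guaranteed Investment Certificate - Long-Term and Simple Interest</h3>
  <table>
    <tr><th>Term</th><th>Rate</th></tr>
    <tr><td>1 year</td><td>0.45%</td></tr>
    <tr><td>2 years</td><td>0.50%</td></tr>
  </table>
</section>'

page <- read_html(html)

# Find the heading by its text, then take the first table that follows it
xp <- '//h3[contains(., "Simple Interest")]/following-sibling::table[1]'
simple <- html_table(html_node(page, xpath = xp))

print(simple)
```

Matching on heading text tends to survive layout changes better than a positional selector such as nth-child(7), though it breaks if the heading wording changes.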