
RSelenium: Issue with extracting link from website


I am struggling to extract data from a website with RSelenium and would be grateful for any hint.

The site's address is "https://www.airbank.cz/mapa-pobocek-a-bankomatu/brno-netroufalky-c-p-770/". The link I want to extract is:

<a class="flex items-center" href="https://www.google.com/maps/dir//50.659742,+14.039068/@50.659742,14.039068,16z/">

Below my code:

library(RSelenium)
library(tidyverse)

rD <- rsDriver(browser = "firefox", port = 483L, verbose = FALSE)
remDr <- rD[["client"]]
x <- "https://www.airbank.cz/mapa-pobocek-a-bankomatu/brno-netroufalky-c-p-770/"

remDr$navigate(x)
Sys.sleep(5) # give the page time to fully load
current_url <- remDr$getCurrentUrl()
current_url
remDr$getStatus()
page_source <- remDr$getPageSource()[[1]]
class(page_source)
Sys.sleep(5) # give the page time to fully load

link_google <- page_source %>%
  xml2::read_html() %>%
  rvest::html_elements("a") %>%
  rvest::html_attr("href")

str_subset(link_google, "dir")
#> character(0)

I am not sure why I don't get the desired result (I do get other links). My suspicion is that it is related to the presence of an iframe, but I couldn't figure it out.

When checking the raw result of page_source <- remDr$getPageSource()[[1]] I actually can't find the link in question. However, when inspecting the site in my browser, the link is present.


Solution

  • To extract the href attribute, i.e. https://www.google.com/maps/dir//50.659742,+14.039068/@50.659742,14.039068,16z/, locate the anchor with findElement and call the getElementAttribute method on it (the result is returned as a list). Either of the following locator strategies can be used:

    • Using css selector:

      element <- remDr$findElement(using = "css selector", "a.flex.items-center[href]")
      element$getElementAttribute("href")
      
    • Using xpath:

      element <- remDr$findElement(using = "xpath", "//a[@class='flex items-center' and @href]")
      element$getElementAttribute("href")
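
If findElement still cannot locate the link, two things from the question are worth ruling out: the element may be rendered after the fixed Sys.sleep(), or it may sit inside an iframe as suspected. The following is only a sketch under those assumptions, not a verified fix for this particular page. The helper names (wait_for_elements, get_hrefs, find_direction_links) and the selector a[href*='google.com/maps/dir'] are made up for illustration; the selector targets the href itself because the class "flex items-center" may be shared by other links on the page.

```r
# Hypothetical helper: poll until `css` matches at least one element,
# or until `timeout` seconds have passed. Returns a (possibly empty) list.
wait_for_elements <- function(remDr, css, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    found <- remDr$findElements(using = "css selector", value = css)
    if (length(found) > 0 || Sys.time() >= deadline) return(found)
    Sys.sleep(interval)
  }
}

# Hypothetical helper: getElementAttribute() returns a list, so unwrap each
# result into a plain character vector.
get_hrefs <- function(elements) {
  vapply(elements, function(el) el$getElementAttribute("href")[[1]], character(1))
}

# Look in the top-level document first, then inside each iframe in turn.
# The default selector is an assumption based on the link in the question.
find_direction_links <- function(remDr,
                                 css = "a[href*='google.com/maps/dir']",
                                 timeout = 10) {
  links <- get_hrefs(wait_for_elements(remDr, css, timeout))
  if (length(links) > 0) return(links)

  frames <- remDr$findElements(using = "css selector", value = "iframe")
  for (frame in frames) {
    remDr$switchToFrame(frame)   # focus the iframe
    links <- get_hrefs(wait_for_elements(remDr, css, timeout = 2))
    remDr$switchToFrame(NULL)    # back to the top-level document
    if (length(links) > 0) return(links)
  }
  character(0)
}
```

With the remDr client from the question, call find_direction_links(remDr) after remDr$navigate(x). An empty result means the link was found neither at the top level nor in any first-level iframe within the timeout; nested iframes are not searched by this sketch.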
      
