Tags: r, loops, web-scraping, rselenium

RSelenium: Load a page by clicking on load more button: stop the loop if the webelement is gone


I want to scrape the following page with RSelenium: https://www.letempsarchives.ch/recherche?q=%22Willy+Spuehler%22#

Before I can scrape the page I need to load all the search results on the webpage. In order to do so I have to click on the "Voir plus des résultats..."-button until the button is not visible anymore and all the search results are loaded. I tried to write a while-loop as follows:

library(RSelenium)
rD <- rsDriver(browser = "chrome", chromever = "87.0.4280.88", port = 4568L)
url <- "https://www.letempsarchives.ch/recherche?q=%22Willy+Spuehler%22#"
remote_driver <- rD[["client"]] 
remote_driver$navigate(url)

chk <- FALSE
while (!chk) {
  loadmore <- remote_driver$findElement("xpath", "//*[@class='ui fluid button huge loadMore']")
  if (length(loadmore) > 0L) {
    loadmore$clickElement()
    Sys.sleep(5)
  } else {
    chk <- TRUE
  }
}

This way I get all the search results loaded but the loop stops with the following error:

Selenium message:stale element reference: element is not attached to the page document

The load-more button disappears once all search results are loaded, which is exactly what I need, but I also need the loop to stop without an error so that the rest of my code can continue. I got the idea from here. Any help is highly appreciated.

EDIT: I made a mistake in the code that I copied in here and have now corrected it. (Sorry for that!) But I still get the same error.


Solution

  • I was able to get my code to run without stopping with the help of this post and this post. For some reason, if the element is not displayed, isElementDisplayed() gives me an error instead of FALSE. Wrapping the loop in tryCatch() and suppressMessages() lets it run through. It is certainly not the most elegant solution, but it works.

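      # Any Selenium error (e.g. the button is gone) ends the loop quietly
      tryCatch({
        Sys.sleep(5)  # give the first batch of results time to load
        suppressMessages({
          loadmore <- remote_driver$findElement("xpath", "//*[@class='ui fluid button huge loadMore']")
          while (loadmore$isElementDisplayed()[[1]]) {
            loadmore$clickElement()
            Sys.sleep(10)  # wait for the next batch of results
            # Look the button up again; the old reference goes stale after a reload
            loadmore <- remote_driver$findElement("xpath", "//*[@class='ui fluid button huge loadMore']")
          }
        })
      },
      error = function(e) {
        # Swallow the error so the rest of the script can continue
        NA_character_
      }
      )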
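
A possible alternative to catching the error, offered only as a sketch: RSelenium's findElements() (plural) returns an empty list when nothing matches, whereas findElement() raises an error, so the loop can test the length of the result directly. The helper name click_until_gone and its wait and max_clicks parameters are made up for this illustration and are not part of RSelenium. It assumes the button is removed from the page (or stops being clickable) once everything is loaded, and it has not been tested against the site above.

```r
# Hypothetical helper: keep clicking a "load more" button until it is gone.
# `driver` is assumed to be an RSelenium remoteDriver, e.g. rD[["client"]].
click_until_gone <- function(driver, xpath, wait = 5, max_clicks = 200L) {
  clicks <- 0L
  while (clicks < max_clicks) {
    # findElements() returns a list; it is empty when no element matches
    buttons <- driver$findElements(using = "xpath", value = xpath)
    if (length(buttons) == 0L) break

    # The button can vanish or become hidden between lookup and click;
    # treat a failed click as "nothing left to load"
    clicked <- tryCatch({
      buttons[[1]]$clickElement()
      TRUE
    }, error = function(e) FALSE)
    if (!clicked) break

    clicks <- clicks + 1L
    Sys.sleep(wait)  # give the next batch of results time to load
  }
  clicks  # number of successful clicks
}

# Usage with the driver from the question (not run here):
# n <- click_until_gone(remote_driver,
#                       "//*[@class='ui fluid button huge loadMore']",
#                       wait = 5)
```

The max_clicks cap is only a safety net so the loop cannot run forever if the button never disappears; the returned count makes it easy to check how many batches were loaded.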