
"Rescue" command in R?


I have this code:

library(rvest)
url_list <- c("https://github.com/rails/rails/pull/100", 
               "https://github.com/rails/rails/pull/200", 
               "https://github.com/rails/rails/pull/300")

mine <- function(url){
  url_content  <- html(url)
  url_mainnode <- html_node(url_content, "*")
  url_mainnode_text <- html_text(url_mainnode)
  url_mainnode_text <- gsub("\n", "", url_mainnode_text) # clean up the text
  url_mainnode_text
}

messages <- lapply(url_list, mine)

However, as I make the list longer I tend to run into this error:

Error in html.response(r, encoding = encoding) : 
  server error: (500) Internal Server Error 

I know that in Ruby I can use rescue to keep iterating through a list even when some applications of a function fail. Is there something similar in R?


Solution

  • One option is to use try(); see ?try for details. Note that if the download fails, the rest of the function still has to be skipped, so check the result of try() before using it. Here's an implementation:

    library(rvest)
    url_list <- c("https://github.com/rails/rails/pull/100", 
                   "https://github.com/rails/rails/pull/200", 
                   "https://github.com/rails/rails/pull/300")
    
    mine <- function(url){
      try(url_content  <- html(url))
      url_mainnode <- html_node(url_content, "*")
      url_mainnode_text <- html_text(url_mainnode)
      url_mainnode_text <- gsub("\n", "", url_mainnode_text) # clean up the text
      url_mainnode_text
    }
    
    messages <- lapply(url_list, mine)
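
  • A closer analogue to Ruby's begin/rescue is tryCatch(), which lets you attach an error handler that runs whenever the wrapped expression fails. Below is a minimal sketch under the same setup; the mine_safely name and the NA fallback are illustrative choices, not part of the original code. (Also note that in current rvest releases html() has been replaced by read_html(); the old name is kept here to match the question.)

    mine_safely <- function(url){
      tryCatch({
        url_content  <- html(url)
        url_mainnode <- html_node(url_content, "*")
        url_mainnode_text <- html_text(url_mainnode)
        gsub("\n", "", url_mainnode_text) # clean up the text
      },
      error = function(e) NA # on any error (e.g. a 500), return NA and keep going
      )
    }
    
    messages <- lapply(url_list, mine_safely)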