Tags: r, web-scraping, text-mining

How to skip some lines in R


I have many URLs whose text I import in R. I use this code:

setNames(lapply(1:1000, function(x) gettxt(get(paste0("url", x)))), paste0("url", 1:1000, "_txt")) %>% 
  list2env(envir = globalenv())

However, some URLs cannot be imported and throw this error:

Error in file(con, "r") : cannot open the connection In addition: Warning message: In file(con, "r") : InternetOpenUrl failed: 'A connection with the server could not be established'

So my code stops and doesn't import text from any URL. How can I recognize the bad URLs and skip them in order to import the correct ones?


Solution

  • One possible approach, besides the tryCatch mentioned by @tester, is the purrr package:

    library(purrr)
    # declare function
    my_gettxt <- function(x) {
        gettxt(get(paste0("url", x)))
    }
    # make the function error-tolerant by defining an `otherwise` value
    # (could be an empty df with column definitions, etc.) returned if the function fails
    my_gettxt <- purrr::possibly(my_gettxt, otherwise = NA)
    # use map from purrr instead of apply function
    my_data <- purrr::map(1:1000, ~my_gettxt(.x))
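
For completeness, the tryCatch alternative mentioned above can be sketched like this. Since gettxt and the url1…url1000 objects from the question aren't available here, `fetch` is a hypothetical stand-in that fails on even indices to simulate unreachable URLs; the point is that the error handler returns NA instead of aborting the whole loop:

```r
# hypothetical stand-in for gettxt(get(paste0("url", x))) from the question;
# errors on even indices to simulate a failed connection
fetch <- function(x) {
  if (x %% 2 == 0) stop("connection failed") else paste("text of url", x)
}

# wrap the call in tryCatch so a failing URL yields NA and the loop continues
safe_fetch <- function(x) {
  tryCatch(fetch(x), error = function(e) NA)
}

results <- setNames(lapply(1:6, safe_fetch), paste0("url", 1:6, "_txt"))
ok <- results[!is.na(results)]  # keep only the URLs that imported successfully
```

Afterwards you can inspect which entries are NA to see exactly which URLs failed, rather than losing the entire run to one bad connection.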