
How to use the tryCatch() function?


I want to write code using tryCatch to deal with errors downloading data from the web.

url <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz")
y <- mapply(readLines, con=url)

These two statements run successfully. Below, I create a non-existent web address:

url <- c("xxxxx", "http://en.wikipedia.org/wiki/Xz")

The address in url[1] does not exist. How does one write a function using tryCatch so that:

  1. When the URL is wrong, the output will be: "web URL is wrong, can't get".
  2. When the URL is wrong, the code does not stop, but continues to download until the end of the list of URLs?

Solution

    Setting up the code

    urls <- c(
        "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
        "http://en.wikipedia.org/wiki/Xz",
        "xxxxx"
    )
    
    readUrl <- function(url) {
        tryCatch(
            {
                # Just to highlight: if you want to use more than one
                # R expression in the "try" part then you'll have to
                # use curly brackets.
                # 'tryCatch()' will return the last evaluated expression
                # in case the "try" part was completed successfully
    
                message("This is the 'try' part")
    
                suppressWarnings(readLines(url))
                # The return value of `readLines()` is the actual value
                # that will be returned in case there is no condition
                # (e.g. warning or error).
            },
            error = function(cond) {
                message(paste("URL does not seem to exist:", url))
                message("Here's the original error message:")
                message(conditionMessage(cond))
                # Choose a return value in case of error
                NA
            },
            warning = function(cond) {
                message(paste("URL caused a warning:", url))
                message("Here's the original warning message:")
                message(conditionMessage(cond))
                # Choose a return value in case of warning
                NULL
            },
            finally = {
                # NOTE:
                # Here goes everything that should be executed at the end,
                # regardless of success or error.
                # If you want more than one expression to be executed, then you
                # need to wrap them in curly brackets ({...}); otherwise you could
                # just have written 'finally = <expression>' 
                message(paste("Processed URL:", url))
                message("Some other message at the end")
            }
        )
    }
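
    As a smaller, offline illustration of the same structure, the sketch below shows that the finally expression is evaluated both when the "try" part succeeds and when the error handler runs. The names with_cleanup and calls are hypothetical and are not part of the code above.

```r
# Counter that records how often the 'finally' expression was evaluated
calls <- 0

with_cleanup <- function(x) {
    tryCatch(
        {
            # The "try" part: signal an error for negative input
            if (x < 0) stop("negative input")
            sqrt(x)
        },
        error = function(cond) {
            # Return value in case of error
            NA
        },
        finally = {
            # Evaluated on success and on error alike
            calls <<- calls + 1
        }
    )
}

r1 <- with_cleanup(4)    # "try" part completes, value of sqrt(4) is returned
r2 <- with_cleanup(-4)   # error handler runs, NA is returned
```

    After both calls, calls has been incremented twice, once per call, regardless of whether the error handler was involved.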
    

    Using the code

    > y <- lapply(urls, readUrl)
    This is the 'try' part
    Processed URL: http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html
    Some other message at the end
    This is the 'try' part
    Processed URL: http://en.wikipedia.org/wiki/Xz
    Some other message at the end
    This is the 'try' part
    URL does not seem to exist: xxxxx
    Here's the original error message:
    cannot open the connection
    Processed URL: xxxxx
    Some other message at the end
    

    Investigating the output

    > head(y[[1]])
    [1] "<!DOCTYPE html><html><head><title>R: Functions to Manipulate Connections (Files, URLs, ...)</title>"
    [2] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" />"
    [3] "<meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, user-scalable=yes\" />"
    [4] "<link rel=\"stylesheet\" href=\"https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css\">"
    [5] "<script type=\"text/javascript\">"
    [6] "const macros = { \"\\\\R\": \"\\\\textsf{R}\", \"\\\\code\": \"\\\\texttt\"};"
    
    > length(y)
    [1] 3
    
    > y[[3]]
    [1] NA
    

    Additional remarks

    tryCatch

    tryCatch returns the value of evaluating expr unless an error or a warning is signaled. In that case, a specific return value (see NA above) can be specified by supplying a corresponding handler function (see the arguments error and warning in ?tryCatch). The handlers can be functions that already exist, but you can also define them within tryCatch(), as I did above.
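
    A minimal sketch of both ways of supplying handlers, using log() instead of a download so that it runs offline. The helper names na_on_error and safe_log are made up for this illustration.

```r
# A handler that already exists before tryCatch() is called (hypothetical name)
na_on_error <- function(cond) NA

safe_log <- function(x) {
    tryCatch(
        log(x),
        # Existing function passed as the error handler
        error = na_on_error,
        # Handler defined inline, as in readUrl() above
        warning = function(cond) NULL
    )
}

res_ok      <- safe_log(10)    # no condition: the value of log(10) is returned
res_warning <- safe_log(-1)    # log(-1) signals a warning: the handler returns NULL
res_error   <- safe_log("a")   # non-numeric input signals an error: NA is returned
```

    Note that the value of the handler replaces the value of expr entirely: once a warning is caught this way, the result that log(-1) would otherwise have produced is not returned.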

    The implications of choosing specific return values of the handler functions

    As we've specified that NA should be returned in case of error, the third element in y is NA.
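
    One way to act on this, sketched under the assumption that y has the shape shown above: flag the elements that are a single NA and use the flags to recover the failing URLs. The list y below is a small stand-in for real download results, and is_failed is a hypothetical helper.

```r
urls <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz",
    "xxxxx"
)

# Stand-in for the result of lapply(urls, readUrl): two character vectors
# of page lines and one NA for the URL that could not be read.
y <- list(c("<!DOCTYPE html>", "<html>"), c("<!DOCTYPE html>"), NA)

# An element counts as failed if it is exactly one NA value.
# The length check comes first, so a NULL element is not flagged.
is_failed <- function(el) length(el) == 1L && is.na(el)

failed <- vapply(y, is_failed, logical(1))

failed_urls <- urls[failed]   # URLs whose download returned NA
y_ok <- y[!failed]            # only the successfully read pages
```

    Returning a sentinel such as NA keeps y the same length as urls, which is what makes this positional lookup possible.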