I want to write code using tryCatch to deal with errors downloading data from the web.
url <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz")
y <- mapply(readLines, con=url)
These two statements run successfully. Below, I create a nonexistent web address:
url <- c("xxxxx", "http://en.wikipedia.org/wiki/Xz")
url[1] does not exist. How does one write a tryCatch loop (function) so that a bad URL is reported and the remaining URLs are still downloaded?
urls <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz",
    "xxxxx"
)
readUrl <- function(url) {
    tryCatch(
        {
            # Just to highlight: if you want to use more than one
            # R expression in the "try" part then you'll have to
            # use curly brackets.
            # 'tryCatch()' will return the last evaluated expression
            # in case the "try" part was completed successfully
            message("This is the 'try' part")
            suppressWarnings(readLines(url))
            # The return value of `readLines()` is the actual value
            # that will be returned in case there is no condition
            # (e.g. warning or error).
        },
        error = function(cond) {
            message(paste("URL does not seem to exist:", url))
            message("Here's the original error message:")
            message(conditionMessage(cond))
            # Choose a return value in case of error
            NA
        },
        warning = function(cond) {
            message(paste("URL caused a warning:", url))
            message("Here's the original warning message:")
            message(conditionMessage(cond))
            # Choose a return value in case of warning
            NULL
        },
        finally = {
            # NOTE:
            # Here goes everything that should be executed at the end,
            # regardless of success or error.
            # If you want more than one expression to be executed, then you
            # need to wrap them in curly brackets ({...}); otherwise you could
            # just have written 'finally = <expression>'
            message(paste("Processed URL:", url))
            message("Some other message at the end")
        }
    )
}
> y <- lapply(urls, readUrl)
This is the 'try' part
Processed URL: http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html
Some other message at the end
This is the 'try' part
Processed URL: http://en.wikipedia.org/wiki/Xz
Some other message at the end
This is the 'try' part
URL does not seem to exist: xxxxx
Here's the original error message:
cannot open the connection
Processed URL: xxxxx
Some other message at the end
> head(y[[1]])
[1] "<!DOCTYPE html><html><head><title>R: Functions to Manipulate Connections (Files, URLs, ...)</title>"
[2] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" />"
[3] "<meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, user-scalable=yes\" />"
[4] "<link rel=\"stylesheet\" href=\"https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css\">"
[5] "<script type=\"text/javascript\">"
[6] "const macros = { \"\\\\R\": \"\\\\textsf{R}\", \"\\\\code\": \"\\\\texttt\"};"
> length(y)
[1] 3
> y[[3]]
[1] NA
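The output above shows the finally messages printed for every URL, including the one that failed. The order in which the parts run can be checked offline with a minimal sketch; run_demo and its fail argument are hypothetical names introduced only for illustration, and the failure is simulated with stop() rather than a real download.

```r
# Hypothetical helper that records the order in which the parts of
# tryCatch() are evaluated. Nothing here touches the network.
run_demo <- function(fail) {
  steps <- character(0)
  value <- tryCatch(
    {
      steps <- c(steps, "try")
      if (fail) stop("simulated failure")  # stands in for a failed download
      "ok"
    },
    error = function(cond) {
      # The handler is a function, so it needs <<- to update 'steps'
      # in the enclosing function.
      steps <<- c(steps, "error")
      NA
    },
    finally = {
      # Evaluated when tryCatch() exits, on success and on error alike.
      steps <- c(steps, "finally")
    }
  )
  list(value = value, steps = steps)
}

success <- run_demo(fail = FALSE)
failure <- run_demo(fail = TRUE)
success$steps
failure$steps
```

In the failing case the error handler runs before the finally part, which matches the order of the messages printed for "xxxxx".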
tryCatch
tryCatch returns the value associated with executing expr unless there's an error or a warning. In that case, specific return values (see NA above) can be specified by supplying a respective handler function (see arguments error and warning in ?tryCatch). These can be functions that already exist, but you can also define them within tryCatch() (as I did above).
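The return values described here can be tried out without any network access. The following is a minimal sketch, not part of the function above: safe_log and na_on_error are hypothetical names, and log() is used only because it conveniently signals an error for non-numeric input and a warning for negative input. One handler is an already existing function, the other is defined inside tryCatch().

```r
# Hypothetical handler defined ahead of time: returns NA for any error.
na_on_error <- function(cond) NA

# Hypothetical helper wrapping log() in tryCatch().
safe_log <- function(x) {
  tryCatch(
    log(x),
    error = na_on_error,             # existing function used as handler
    warning = function(cond) NULL    # handler defined within tryCatch()
  )
}

ok   <- safe_log(1)     # no condition: the value of log(1) is returned
err  <- safe_log("a")   # error: the error handler's value is returned
warn <- safe_log(-1)    # warning: the warning handler's value is returned
```

Note that once a warning handler in tryCatch() takes over, the value of the expression itself (here NaN) is discarded in favour of the handler's return value.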
As we've specified that NA should be returned in case of error, the third element in y is NA.
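Because failed downloads are marked with NA, they can be separated from the successful ones after the loop has finished. A minimal sketch, assuming a result list shaped like y; the list y_demo holds made-up stand-in data so the snippet runs offline, and is_failed is a hypothetical helper.

```r
# Stand-in for the result of lapply(urls, readUrl); the contents are made up.
y_demo <- list(
  c("<html>", "</html>"),
  c("<!DOCTYPE html>", "<html>", "</html>"),
  NA
)

# Hypothetical helper: TRUE only for the single NA returned by the
# error handler, FALSE for any character vector of page lines.
is_failed <- function(x) length(x) == 1L && is.na(x)

failed <- vapply(y_demo, is_failed, logical(1))
which(failed)              # positions of the URLs that could not be read
pages <- y_demo[!failed]   # keep only the successfully downloaded pages
```

The length check is there so that pages with many lines are never passed to the && operator as a vector of length greater than one.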