Tags: r, dataframe, error-handling, try-catch, data-retrieval

How to use tryCatch to skip errors when downloading data in R


I am trying to download data from the USGS website using the dataRetrieval package in R.

For that purpose, I have written a function called getstreamflow, which works fine when I run, for example:

siteNumber <- c("094985005","09498501","09489500","09489499","09498502")
Streamflow = getstreamflow(siteNumber)

The output of the function is a list of data frames.

The function runs when there is no problem downloading the data, but for some stations I get the following error:

Request failed [404]. Retrying in 1.1 seconds...
Request failed [404]. Retrying in 3.3 seconds...
For: https://waterservices.usgs.gov/nwis/site/?siteOutput=Expanded&format=rdb&site=0946666666

To keep the function from stopping when it encounters an error, I am trying to use tryCatch as in the following code:

Streamflow = tryCatch(
  expr = {
    getstreamflow(siteNumber)
  }, 
  error = function(e) {
  message(paste(siteNumber," there was an error"))
})

I want the function to skip the station and move on to the next one when it encounters an error. Currently, the output I get is the one shown below, which is obviously wrong because it reports an error for every station:

094985005 there was an error09498501 there was an error09489500 there was an error09489499 there was an error09498502 there was an error09511300 there was an error09498400 there was an error09498500 there was an error09489700 there was an error09500500 there was an error09489082 there was an error09510200 there was an error09489100 there was an error09490500 there was an error09510180 there was an error09494000 there was an error09490000 there was an error09489086 there was an error09489089 there was an error09489200 there was an error09489078 there was an error09510170 there was an error09493500 there was an error09493000 there was an error09498503 there was an error09497500 there was an error09510000 there was an error09509502 there was an error09509500 there was an error09492400 there was an error09492500 there was an error09497980 there was an error09497850 there was an error09492000 there was an error09497800 there was an error09510150 there was an error09499500 there was an error... <truncated>

What am I doing wrong with tryCatch?


Solution

  • Answer

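    You wrapped tryCatch around the whole call to getstreamflow. Hence, if a single site fails, getstreamflow stops with an error and returns nothing, and the handler builds its message from the entire siteNumber vector, which is why every station is reported. You should either supply one site at a time or put the tryCatch inside getstreamflow.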

    Example

    x <- 1:5
    fun <- function(x) {
      for (i in x) if (i == 5) stop("ERROR")
      return(x^2)
    }
    
    tryCatch(fun(x), error = function(e) paste0("wrong", x))
    

    This returns:

    [1] "wrong1" "wrong2" "wrong3" "wrong4" "wrong5"
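
For contrast, the same toy example can be run one element at a time, so that a failure only replaces that element's result. This is a minimal sketch: `fun_one` is a hypothetical single-value variant of `fun` above, not something from the question.

```r
# Hypothetical single-value variant of the toy function: fails only for 5.
fun_one <- function(i) {
  if (i == 5) stop("ERROR")
  i^2
}

x <- 1:5

# tryCatch wraps each call separately, so only the failing element is replaced.
res <- lapply(x, function(i) {
  tryCatch(fun_one(i), error = function(e) paste0("wrong", i))
})

str(res)
```

Here the first four elements hold 1, 4, 9 and 16, and only the fifth is "wrong5".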

    Multiple arguments

    You indicated that you have both siteNumber and datatype to iterate over.

    Using Map, we can define a function that takes two inputs:

    Map(function(x, y) tryCatch(getstreamflow(x, y),
                                error = function(e) message(paste(x, "there was an error"))),
        x = siteNumber,
        y = datatype)
    

    Using a for-loop, we can just iterate over them:

    Streamflow <- vector(mode = "list", length = length(siteNumber))
    for (i in seq_along(siteNumber)) {
      # message() returns NULL; [i] <- list(...) keeps it as a placeholder for failed sites
      Streamflow[i] <- list(tryCatch(
        getstreamflow(siteNumber[i], datatype),
        error = function(e) message(paste(siteNumber[i], "there was an error"))
      ))
    }
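
Because the handler's message() call returns NULL, a failed site yields NULL instead of a data frame, and those entries can be dropped afterwards. A minimal sketch on a made-up list follows; the site labels and data frames are placeholders, not real downloads.

```r
# Placeholder results: two successful "downloads" and one failed site (NULL).
siteNumber <- c("A", "B", "C")
Streamflow <- list(data.frame(flow = 1:2), NULL, data.frame(flow = 3:4))

# Label each entry with its site, note which sites failed,
# then keep only the non-NULL entries.
names(Streamflow) <- siteNumber
failed <- siteNumber[vapply(Streamflow, is.null, logical(1))]
Streamflow <- Filter(Negate(is.null), Streamflow)

names(Streamflow)  # "A" "C"
failed             # "B"
```

Keeping the names makes it easy to see which stations were skipped.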
    

    Or, as suggested, just modify getstreamflow.
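
What modifying getstreamflow could look like is sketched below. The body of the real getstreamflow is not shown in the question, so `download_site` is a hypothetical stand-in for whatever per-site download it performs, and `getstreamflow_safe` is only an illustrative name.

```r
# Hypothetical stand-in for the per-site download; it fails for one made-up site.
download_site <- function(site) {
  if (site == "bad") stop("Request failed [404]")
  data.frame(site = site, flow = c(1.5, 2.5))
}

# Loop over the sites inside the function and guard each download on its own.
getstreamflow_safe <- function(siteNumber) {
  out <- lapply(siteNumber, function(site) {
    tryCatch(
      download_site(site),
      error = function(e) {
        # Report the site and the original error text, then return NULL.
        message(site, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(out) <- siteNumber
  out
}

Streamflow <- getstreamflow_safe(c("good1", "bad", "good2"))
```

With this layout the caller passes the full vector of sites as before, and a failing site only produces a message and a NULL entry.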