
How to check if a file exists at a URL before using download.file in R


I have a problem and I don't know how to solve it. I have a list of direct-download URLs for some files.

For example:

x<-list("https://www.ecb.europa.eu/paym/coll/assets/html/dla/ea_MID/ea_csv_200219.csv",
"http://sdw.ecb.europa.eu/quickviewexport.do?SERIES_KEY=120.EXR.M.USD.EUR.SP00.A&type=csv")

name_file<-list("name_1.csv","name_2.csv")

In this case the script below works fine, but if one or more URLs don't work, tryCatch doesn't return the message. Could somebody please help me and tell me where my mistake is?

  for(i in seq_along(x)) {
    x<-as.character(x[i])
    nse.folder = paste0("directory_files/",name_file[i])
    tryCatch({download.file(x, destfile = nse.folder, method='curl')}, error = function(e) "Error: this url doesn't work!")
    Sys.sleep(4)
  }

To test the script, I truncated the URLs, for example like this:

x<-list("https://www.ecb.europa.eu/paym/coll/assets/html/dla/ea_MID/",
"http://sdw.ecb.europa.eu/quickviewexport.do?")

How should I improve the code?

Thank you in advance.


Solution

  • You can use a HEAD request, which asks the server for the response headers only. In R it is available in the httr package as httr::HEAD. The HTTP status codes are listed on Wikipedia. This SO post may be useful.

    A very simple function could be:

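    urlFileExist <- function(url){
      HTTP_STATUS_OK <- 200
      # HEAD requests the headers only, so the file itself is not downloaded
      hd <- httr::HEAD(url)
      # status code taken from the first entry of the response headers
      status <- hd$all_headers[[1]]$status
      # exists is TRUE only when the server answered 200 (OK)
      list(exists = status == HTTP_STATUS_OK, status = status)
    }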
    
    lapply(x, urlFileExist)
    #[[1]]
    #[[1]]$exists
    #[1] TRUE
    #
    #[[1]]$status
    #[1] 200
    #
    #
    #[[2]]
    #[[2]]$exists
    #[1] TRUE
    #
    #[[2]]$status
    #[1] 200
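
Putting the pieces together, a minimal sketch of the download loop might look like the following. The names `safe_download` and `download_all`, and the `downloader`, `check`, `dir`, and `pause` arguments, are made up for illustration and are not part of the original answer. Two things differ from the loop in the question. The handlers call `message()`, because a value returned by `tryCatch` inside a `for` loop is not auto-printed. A `warning` handler is also added, on the assumption that some download methods may report a failed transfer as a warning rather than an error.

```r
# Sketch only: safe_download(), download_all() and their arguments are
# hypothetical helpers, not part of the original answer.

# Try one download; report problems with message() and return TRUE/FALSE.
safe_download <- function(url, destfile, downloader = utils::download.file, ...) {
  tryCatch({
    status <- downloader(url, destfile = destfile, ...)
    # download.file() returns 0 on success and non-zero on failure
    ok <- identical(as.integer(status), 0L)
    if (!ok) message("Download failed (status ", status, "): ", url)
    ok
  },
  # Assumption: any warning is treated as a failed download
  warning = function(w) {
    message("Warning for ", url, ": ", conditionMessage(w))
    FALSE
  },
  error = function(e) {
    message("Error for ", url, ": ", conditionMessage(e))
    FALSE
  })
}

# Loop over the URLs; `check` is an optional function such as urlFileExist
# that returns a list with an `exists` element.
download_all <- function(urls, files, dir = "directory_files",
                         check = NULL, pause = 0, ...) {
  stopifnot(length(urls) == length(files))
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  ok <- logical(length(urls))
  for (i in seq_along(urls)) {
    # Use a separate variable so the list of URLs is not overwritten
    url <- as.character(urls[[i]])
    dest <- file.path(dir, files[[i]])
    if (!is.null(check)) {
      # The check itself can fail (e.g. unreachable host), so guard it too
      exists <- tryCatch(isTRUE(check(url)$exists), error = function(e) FALSE)
      if (!exists) {
        message("Skipping (not available): ", url)
        next
      }
    }
    ok[i] <- safe_download(url, dest, ...)
    Sys.sleep(pause)
  }
  ok
}

# Possible usage with the objects from the question (needs network access):
# ok <- download_all(x, name_file, check = urlFileExist, pause = 4, method = "curl")
```

The function returns a logical vector with one entry per URL, so the failed downloads can be listed afterwards with `x[!ok]`. Note that the loop in the question also assigns to `x` inside the loop, which overwrites the list of URLs after the first iteration; the sketch uses a separate `url` variable to avoid that.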