Tags: r, ftp, google-earth-engine

How can I download CHIRPS precipitation data in .gz format?


I'm trying to download CHIRPS data from ftp://ftp.chg.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/africa_daily/tifs/p25/2010/. The heavyRain package is outdated and the earthEngineGrabR package (to pull data from Google's Earth Engine, https://developers.google.com/earth-engine) seems to have some bugs. Here are a few of my attempts.

lst.files <- list(
  list(
    url2 = "ftp://chg-ftpout.geog.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/africa_daily/tifs/p25/2010/chirps-v2.0.2010.01.01.tif.gz"
    , target = "chirps-v2.0.2010.01.01.tif.gz"))

#download gzipped files (only if file does not exist)
lapply(lst.files, function(x)
 if(!file.exists(x$target)) download.file(x$url2, x$target))

#open files
lst <- lapply(lst.files, function(x) {
  df <- readr::read_table2(x$target)
  })

Here's the error message: Error in guess_header_(datasource, tokenizer, locale) : embedded nul in string: 'II*'
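(The "embedded nul" error is a clue: readr::read_table2 expects delimited text, but each downloaded file is a gzipped GeoTIFF — "II*" is the little-endian TIFF magic number at the start of the binary file. A sketch of reading one file as a raster instead, assuming the R.utils and raster packages are installed:

```r
library(R.utils)  # for gunzip()
library(raster)   # for raster()

gz  <- "chirps-v2.0.2010.01.01.tif.gz"
tif <- sub("\\.gz$", "", gz)

# decompress the gzipped GeoTIFF, keeping the .gz archive
if (!file.exists(tif)) gunzip(gz, destname = tif, remove = FALSE)

# read the GeoTIFF as a raster of daily precipitation values
r <- raster(tif)
r
```

raster() reads the file via GDAL, so the binary TIFF layout is handled for you.)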

And here's another attempt:

library(RCurl)
library(foreign)
library(plyr)
library(dplyr)

setwd("C://Desktop")
url <- "ftp://chg-ftpout.geog.ucsb.edu"
years = c("2010", "2011", "2012", "2013", "2014", "2015", "2016", "2017", "2018", "2019")

lapply(years, function (x){
  url <- paste(url, "/pub/org/chg/products/CHIRPS-2.0/africa_daily/tifs/p25/", x, ".gz", sep = "")
  filename <- paste("CHIRPS", x, ".gz", sep = "")
  foldername <- paste("CHIRPS", x, sep = "")
  filename

  if (file.exists(filename)==FALSE){
    download.file(url, filename)
  }

  if (file.exists(foldername)==FALSE){
    dir.create(foldername)
  }

  if(length(list.files(path = foldername, pattern="*.gz")) == 0){
    unzip(filename)
    }

  for (fl in (list.files(pattern=".gz"))){
      file.copy(fl, foldername)

    file.remove(fl)
}})

Here's the error message:

trying URL 'ftp://chg-ftpout.geog.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/africa_daily/tifs/p25/2010.gz'
Error in download.file(url, filename) :
  cannot open URL 'ftp://chg-ftpout.geog.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/africa_daily/tifs/p25/2010.gz'
In addition: Warning message:
In download.file(url, filename) :
  cannot open URL 'ftp://chg-ftpout.geog.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/africa_daily/tifs/p25/2010.gz'

And here's what happens using the example provided on the github readme file for earthEngineGrabR, https://github.com/JesJehle/earthEngineGrabR:

> library(earthEngineGrabR)
> library(tidyverse)
> library(sf)
> Chirps_data <- ee_grab(data = ee_data_collection(datasetID = 'UCSB-CHG/CHIRPS/DAILY'
+                                          , spatialReducer = 'mean'
+                                          , temporalReducer = 'sum'
+                                          , timeStart = "2016-01-01"
+                                          , timeEnd = "2016-12-31"
+                                          , resolution = 200)
+                        , targetArea = system.file('data/territories.shp', package = 'earthEngineGrabR'))

Here's the output of the code where it starts, but then gets stuck:

Auto-refreshing stale OAuth token.

upload: territories is already uploaded
Should the file be deleted and uploaded again? [Y/N]: Y
Files deleted:
* territories: 1AOc2yzIV1DGDgfUULNA6Co1M37xcWTFLRbdKOegs
Creating Fusion Table: territories

Error: With the given product argument no valid data could be requested.
In addition: Warning messages:
1: In (function (text) : printing of extremely long output is truncated
2: Error on Earth Engine servers for data product: UCSB-CHG-CHIRPS-DAILY_s-mean_t-sum_2016-01-01to2016-12-31
Error in py_call_impl(callable, dots$args, dots$keywords):
  EEException: Unexpected HTTP error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:727)

Can anyone help me access either of these datasources?


Solution

  • I found a way to do this by adapting the code from this video: https://www.youtube.com/watch?v=EBfx1L16qlM. The code below downloads all the files for a given year; I then repeat it for the next year by editing the URL by hand. It's not an elegant solution, but it works.

    library(RCurl)
    setwd("working directory file name")
    url <- "ftp://chg-ftpout.geog.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/africa_daily/tifs/p25/2010/" 
    filenames <- getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
    filenames <- unlist(strsplit(filenames, "\r\n"))
    filenames
    
    for (filename in filenames) {
      download.file(paste(url, filename, sep = ""), paste(getwd(), "/", filename, sep = ""))
    }
    

    This downloads all the .gz files to my working directory. Faster solutions are welcome, but this does work.
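    To avoid editing the URL by hand for every year, the same listing-and-download step can be wrapped in a loop over years — a sketch, assuming the server uses the same directory layout for each year:

    ```r
    library(RCurl)

    base <- "ftp://chg-ftpout.geog.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/africa_daily/tifs/p25/"

    for (year in 2010:2019) {
      url <- paste0(base, year, "/")
      # list the files in this year's directory
      filenames <- unlist(strsplit(getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE), "\r\n"))
      for (filename in filenames) {
        dest <- file.path(getwd(), filename)
        # skip files that were already downloaded
        if (!file.exists(dest)) download.file(paste0(url, filename), dest)
      }
    }
    ```

    This also shows why the second attempt in the question failed: .../p25/2010 is a directory on the FTP server, so there is no single file called 2010.gz to download — each daily .tif.gz inside the directory has to be fetched individually.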