r, json, curl, rcurl

Download & decompress JSON file in R using curl or RCurl


I have the following bash script to download & decompress a JSON file:

#!/bin/sh -ex

# Ensure data directory (or a link) exists.
test -e results || mkdir results

# Download and decompress data.
curl -u $GISAID_USERNAME:$GISAID_PASSWORD --retry 4 \
  https://www.epicov.org/epi3/3p/$GISAID_FEED/export/provision.json.xz \
  | xz -d -T8 > results/gisaid.json

Ideally I would like an R function that downloads and decompresses this file into a given directory, with the environment variables above ($GISAID_USERNAME, $GISAID_PASSWORD and $GISAID_FEED) passed as arguments. Would anyone know how to accomplish this, e.g. using the curl or RCurl package? It would also be OK not to decompress it and leave it as .json.xz, since I would be reading the file later using:

library(jsonlite)
GISAID_json <- jsonlite::stream_in(gzfile(".//data//GISAID_json//provision.json.xz"))

Solution

  • Something like this should work:

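    library(curl)
    library(glue)
    
    # Download the provision file for `feed` to `dest` with the given credentials.
    custom_curl <- function(user, pass, feed, dest) {
      # Pass the credentials to libcurl (like `curl -u user:pass`).
      custom_handle <- curl::new_handle()
      curl::handle_setopt(
        custom_handle,
        username = user,
        password = pass
      )
      
      url <- glue::glue(
        "https://www.epicov.org/epi3/3p/{feed}/export/provision.json.xz"
      )
      
      # Make sure the destination directory exists before downloading.
      dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
      curl::curl_download(url, dest, handle = custom_handle)
    }
    
    custom_curl('my_user', 'xxxxxx', 'feed1', 'dest/filename.json.xz')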
    

    As I can't test this against the real files and URL, I'm not sure whether the function needs a little tinkering, but it should at least be a starting point for you.
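If you also want the decompressed file on disk, as the `xz -d` step in the bash script produces, something along these lines might work. It is only a sketch using base R's `xzfile()` connection; the function name `decompress_xz` and the `chunk_size` argument are made up for illustration, and it has not been tried on the real GISAID file.

```r
# Hypothetical helper: decompress an .xz file to `dest` in chunks (base R only),
# so a large file does not have to fit in memory at once.
decompress_xz <- function(src, dest, chunk_size = 1024 * 1024) {
  # Create the output directory if needed (like `test -e results || mkdir results`).
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)

  # Reading from an xzfile() connection yields the decompressed bytes.
  input <- xzfile(src, open = "rb")
  on.exit(close(input), add = TRUE)
  output <- file(dest, open = "wb")
  on.exit(close(output), add = TRUE)

  repeat {
    chunk <- readBin(input, what = "raw", n = chunk_size)
    if (length(chunk) == 0) break  # end of the compressed stream
    writeBin(chunk, output)
  }
  invisible(dest)
}
```

To mirror the script, the credentials could be read with `Sys.getenv("GISAID_USERNAME")`, `Sys.getenv("GISAID_PASSWORD")` and `Sys.getenv("GISAID_FEED")`, passed to `custom_curl()` with a destination such as `results/provision.json.xz`, and the downloaded file then handed to `decompress_xz()` with `results/gisaid.json` as the output.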