I want to download the following link in R keeping the file name after browser download.
http://www.coingecko.com/price_charts/export/1/usd.xls
Notice that the file name will be 'btc-usd-max.xls' if you download the file through browser. However, if I run the following code, the file name will be 'usd.xls'. However, the file name 'btc-usd-max.xls' conveys relevant information. Is it possible to retrieve this 'btc-usd-max' name using R?
link <- 'http://www.coingecko.com/price_charts/export/1/usd.xls'
download.file(url=link, destfile=basename(link), method='auto')
The solution provided in the following topic does not solve the problem. R download.file to a folder keeping original name
The aforementioned solution depends on content-disposition
, which does not exists.
library(httr)
hd <- HEAD(link)
filename <- gsub(".*name=", "", headers(hd)$`content-disposition`)
I have tried using HEAD() command from httr package and also downloading the files using the download.file function.
The displayed file name is in a response header called content-disposition
, but only if you get a status 200 from the server. The default user-agent in httr causes the server to return a status 403 that does not return this header, so you need to add a user agent that the server will respond to.
Once this is done it's fairly easy to extract the file name.
library(httr)
url <- 'http://www.coingecko.com/price_charts/export/1/usd.xls'
UA <- "Mozilla/5.0 (Windows NT 6.1; rv:85.0) Gecko/20100101 Firefox/85.0"
res <- GET(url, user_agent(UA))
filename <- strsplit(res$headers$`content-disposition`, '\"')[[1]][2]
Now we have
filename
#> [1] "btc-usd-max.xls"
And we can write the result like this:
writeBin(res$content, filename)
Strangely enough though, the actual binary data appears to be in csv format rather than xls format (I get a warning when I try to open it in Excel, whether downloading in R or directly in the browser). You might prefer to read it as a csv using:
data <- read.csv(text = content(res, 'text'))
Then write data
to an Excel sheet named according to the filename.