Downloading CSV as plain/text from Website


I am trying to automate the download of a dataset from a website but cannot get the file I want. With RCurl, the request fails with a "tlsv1 alert protocol version" error. With httr, the download runs, but the file comes back as HTML (content type text/html) rather than CSV. I have tried a handful of other things, but nothing has worked. Please advise.

Code for downloading with httr:

library(httr)

### Lung cancer screening locator tool URL
url1 = "https://report.acr.org/#/site/PUBLIC/views/NRDRLCSLocator/ADownload.csv"

# Writes the response body to a temp file; the response comes back as HTML, not CSV
GET(url1, write_disk(tf <- tempfile(fileext = ".csv")))

lcsr = read.csv(tf)

The original page for this request is https://www.acr.org/Clinical-Resources/Lung-Cancer-Screening-Resources/LCS-Locator-Tool, and the Tableau view behind it is located at https://report.acr.org/t/PUBLIC/views/NRDRLCSLocator/LCSLocator?:embed=y&:showVizHome=no&:host_url=https%3A%2F%2Freport.acr.org%2F&:embed_code_version=3&:tabs=no&:toolbar=no&:showAppBanner=no&:display_spinner=no&:loadOrderID=0


Solution
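
A likely reason the httr request returns HTML: everything after `#` in a URL is a fragment, and HTTP clients do not send the fragment to the server. The page's JavaScript is what interprets `#/site/PUBLIC/views/...`, so a plain GET only ever asks for the site root, which is why a real browser helps here. A minimal base R sketch to illustrate (the helper name `strip_fragment` is made up for this example):

```r
# Hypothetical helper: drop the fragment, i.e. everything from the first "#"
strip_fragment <- function(url) sub("#.*$", "", url)

url1 <- "https://report.acr.org/#/site/PUBLIC/views/NRDRLCSLocator/ADownload.csv"

# This is the part of the URL that a plain HTTP request actually targets
requested <- strip_fragment(url1)
print(requested)  # "https://report.acr.org/"
```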

  • An RSelenium solution:

    Set the download directory through Chrome's preferences, then navigate to the CSV URL so that the browser downloads the file:

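    library(RSelenium)
    library(readr)
    
    # Directory Chrome should save downloads to
    download_dir <- "D:\\mywork"
    eCaps <- list(
      chromeOptions = 
        list(prefs = list('download.default_directory' = download_dir))
    )
    driver <- rsDriver(browser = "chrome", extraCapabilities = eCaps)
    remDr <- driver[["client"]]
    # Navigating to the .csv URL makes the browser download the file
    remDr$navigate("https://report.acr.org/#/site/PUBLIC/views/NRDRLCSLocator/ADownload.csv")
    Sys.sleep(10)  # crude pause so the download can finish; adjust as needed
    # Read from the download directory, not the R working directory
    df <- read_csv(file.path(download_dir, "ADownload.csv"))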
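
Because the browser downloads the file asynchronously, reading it right after `navigate()` can fail if the file is not on disk yet. A minimal sketch of a polling helper in base R follows; `wait_for_file` and its parameters are made up for illustration and are not part of RSelenium. It assumes the file is complete once its size is non-zero and unchanged between two checks.

```r
# Hypothetical helper: poll until `path` has a non-zero size that is unchanged
# between two consecutive checks, or give up after `timeout` seconds.
# Returns TRUE if the file looks complete, FALSE on timeout.
wait_for_file <- function(path, timeout = 60, interval = 0.5) {
  deadline <- Sys.time() + timeout
  last_size <- -1
  while (Sys.time() < deadline) {
    size <- file.info(path)$size  # NA if the file does not exist yet
    if (!is.na(size) && size > 0) {
      if (size == last_size) return(TRUE)
      last_size <- size
    }
    Sys.sleep(interval)
  }
  FALSE
}

# Small self-contained demonstration using temporary files
demo_path <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2"), demo_path)
found <- wait_for_file(demo_path, timeout = 5, interval = 0.1)
missing <- wait_for_file(tempfile(), timeout = 0.5, interval = 0.1)
```

With the RSelenium code above, this could be called as `wait_for_file(file.path("D:\\mywork", "ADownload.csv"))` before reading the CSV, assuming the downloaded file is indeed named `ADownload.csv`.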