r, azure, azure-machine-learning-service

R web scraping in Azure ML errors out


I have written a script in RStudio (running R 3.5.2) that scrapes data from a particular website. For each page, the script uses download.file to pull the underlying source and then uses the tags to extract the desired data.

The script runs without error in RStudio, but when I run the same code in the "Execute R Script" node in Azure ML, it throws error 0063, saying that it "cannot reach URL ". The code runs fine up to the point where it tries to reach the URL (see code below).

I have tried switching the R version in Azure ML, but none of the three options works.

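# url and filename are set elsewhere in the script (not shown)
for (a in seq_along(job_url)) {
  # This is the call that fails in Azure ML with "cannot reach URL"
  download.file(url, destfile = filename, quiet = TRUE)
  ...
}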

I expect the script to run the same in RStudio and Azure ML. Any ideas how to get this script to run in Azure ML the same way it runs in RStudio?


Solution

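  • For security reasons, Azure blocks all networking from or to R code in Execute R Script modules. That is why download.file cannot reach the URL from inside the module, whichever R version is selected, while the same call works in RStudio.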

    https://learn.microsoft.com/en-us/azure/machine-learning/studio-module-reference/execute-r-script#networking
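    Since the module itself cannot reach the network, one possible workaround is to do the downloading outside Azure ML and only bring the results in as a dataset. The sketch below is a rough illustration of the local half, not a tested solution for your site: fetch_pages, dest_dir and the page_NNN.html naming are hypothetical names made up for this example, and job_url is assumed to be the character vector of URLs from the question.

```r
# Hypothetical helper: download each URL to a local file and record whether
# the download succeeded, instead of stopping at the first failure.
# `fetch` defaults to download.file; it is a parameter only so the helper
# can be exercised without network access.
fetch_pages <- function(urls, dest_dir, fetch = download.file) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)

  results <- data.frame(
    url  = urls,
    file = rep(NA_character_, length(urls)),
    ok   = rep(FALSE, length(urls)),
    stringsAsFactors = FALSE
  )

  for (i in seq_along(urls)) {
    dest <- file.path(dest_dir, sprintf("page_%03d.html", i))

    # Treat both errors and warnings (e.g. "cannot open URL") as a failure
    ok <- tryCatch({
      fetch(urls[i], destfile = dest, quiet = TRUE)
      TRUE
    },
    error   = function(e) FALSE,
    warning = function(w) FALSE)

    if (ok) results$file[i] <- dest
    results$ok[i] <- ok
  }

  results
}

# Example use on a machine that does have network access (not run here):
# pages <- fetch_pages(job_url, "scraped_pages")
# ...extract the desired data from the saved files into a data frame...
# write.csv(extracted, "scraped_data.csv", row.names = FALSE)
```

    You would then upload the resulting CSV as a dataset and connect it to the module's first input port. In Studio (classic) the Execute R Script module reads that port with maml.mapInputPort(1), so the R code inside Azure ML only processes data that is already there and never needs to reach out to the URL.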