Tags: r, azure, jupyter-notebook, jupyter-irkernel

Download a custom dataset in Azure ML Jupyter/iPython Notebook using R


I need to download a custom dataset in an Azure ML Jupyter/iPython Notebook. My ultimate goal is to install an R package, and to do that the package (uploaded as a dataset) has to be downloaded in code. I followed the steps outlined by Andrie de Vries in the comments section of this post: Jupyter Notebooks with R in Azure ML Studio.

Uploading the package as a ZIP file went without problems, but when I ran the code in my notebook I got an error:

Error in curl(x$DownloadLocation, handle = h, open = conn): Failure when receiving data from the peer
Traceback:

  1. download.datasets(ws, "plotly_3.6.0.tar.gz.zip")
  2. lapply(1:nrow(datasets), function(j) get_dataset(datasets[j, ], ...))
  3. FUN(1L[[1L]], ...)
  4. get_dataset(datasets[j, ], ...)
  5. curl(x$DownloadLocation, handle = h, open = conn)

So I simplified my code to:

library("AzureML")
ws <- workspace()
ds <- datasets(ws)
ds$Name

data <- download.datasets(ws, "plotly_3.6.0.tar.gz.zip")
head(data)

Here "plotly_3.6.0.tar.gz.zip" is the name of my dataset, which has data type "Zip". Unfortunately this results in the same error. To rule out data type issues I also tried to download another dataset of mine, which has data type "Dataset". That gave the same error as well.

Next I switched to the sample datasets that ship with Azure ML Studio. "text.preprocessing.zip" has data type Zip:

data <- download.datasets(ws, "text.preprocessing.zip")

"Flight Delays Data" has data type GenericCSV:

data <- download.datasets(ws, "Flight Delays Data")

Both of the sample datasets can be downloaded without problems.
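The comparison above (own datasets fail, sample datasets succeed) could be run in one pass with a sketch like the following. The helpers `try_download` and `compare_downloads` are hypothetical names introduced here, not part of the AzureML package; they only wrap whatever download function is passed in and record the error message instead of aborting the cell.

```r
# Hypothetical helper: call download_fun(name) and return "ok" on success,
# or the error message if the call fails.
try_download <- function(download_fun, name) {
  tryCatch(
    {
      download_fun(name)
      "ok"
    },
    error = function(e) conditionMessage(e)
  )
}

# Hypothetical helper: run try_download for several dataset names and
# return a named character vector (dataset name -> "ok" or error message).
compare_downloads <- function(names, download_fun) {
  vapply(names, function(n) try_download(download_fun, n), character(1))
}

# Usage in the notebook (not run here; assumes library("AzureML") is loaded
# and ws <- workspace() succeeds as in the code above):
# compare_downloads(
#   c("plotly_3.6.0.tar.gz.zip", "text.preprocessing.zip", "Flight Delays Data"),
#   function(n) download.datasets(ws, n)
# )
```

A result of this shape makes it easy to see at a glance whether the failure is tied to the data type or to who owns the dataset.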

So why can't I download my own saved dataset?

I could not find anything helpful in the documentation of the download.datasets function, neither on rdocumentation.org nor on cran.r-project.org (pages 17-18).


Solution

  • It seems the error I got was due to a bug in the (at that time still early) Azure ML Studio.

    After Daniel Prager's reply I tried again and found that my code worked as expected without any changes. Adding the id and auth parameters was not needed.
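    For the original goal of installing the package, the remaining steps after a successful download might look like the sketch below. It rests on assumptions not confirmed in this thread: that download.datasets returns a dataset of data type Zip as a raw vector, and that the uploaded archive contains the source tarball (here presumably plotly_3.6.0.tar.gz). The helpers `write_zip_bytes` and `find_tarball` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: write the raw bytes of a Zip dataset to disk and
# return the path of the written file.
# Assumption: zip_bytes is a raw vector holding the zip archive.
write_zip_bytes <- function(zip_bytes, workdir = tempfile("pkg_")) {
  stopifnot(is.raw(zip_bytes))
  dir.create(workdir, recursive = TRUE, showWarnings = FALSE)
  zip_path <- file.path(workdir, "package.zip")
  writeBin(zip_bytes, zip_path)
  zip_path
}

# Hypothetical helper: pick the first .tar.gz file among extracted paths.
find_tarball <- function(files) {
  hits <- files[grepl("\\.tar\\.gz$", files)]
  if (length(hits) == 0) stop("no .tar.gz file found in the archive")
  hits[[1]]
}

# Usage in the notebook (not run here; assumes ws <- workspace() works and
# that the Zip dataset comes back as a raw vector):
# zip_bytes <- download.datasets(ws, "plotly_3.6.0.tar.gz.zip")
# zip_path  <- write_zip_bytes(zip_bytes)
# files     <- unzip(zip_path, exdir = dirname(zip_path))
# install.packages(find_tarball(files), repos = NULL, type = "source")
```

    Installing with repos = NULL does not resolve dependencies, so any packages the tarball depends on would have to be available already.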