R - IMDb datasets not loading


I'm trying to write a function that downloads and loads files from the IMDb datasets page: https://datasets.imdbws.com/

The problem is that the function downloads the file but does not load it into any object.

Running the same steps one by one, outside a function, works fine:

url <- "https://datasets.imdbws.com/name.basics.tsv.gz"
tmp <- tempfile()
download.file(url, tmp)

name_basics <- readr::read_tsv(
  file = gzfile(tmp),
  col_names = TRUE, 
  quote = "",
  na = "\\N",
  progress = FALSE
)

The file is downloaded and loaded into name_basics. But when I wrap the same steps in a function, no data is loaded. What have I done wrong?

Function code

imdbTSVfiles <- function(fileName){
  url <- paste0("https://datasets.imdbws.com/",fileName,".tsv.gz")
  tmp <- tempfile()
  download.file(url, tmp)

  name <- readr::read_tsv(
      file = gzfile(tmp),
      col_names = TRUE,
      quote = "",
      na = "\\N")
}

imdbTSVfiles("name.basics")

Expected result: the file with the given name is downloaded and loaded into an object.


Solution

  • Inside the function, the result is assigned only to the local variable name, which disappears when the function returns. To store the data in a variable named after the file, use assign():

    imdbTSVfiles <- function(fileName){
      url <- paste0("https://datasets.imdbws.com/",fileName,".tsv.gz")
      tmp <- tempfile()
      download.file(url, tmp)
    
      # Create a global variable named after fileName that holds the parsed table
      assign(fileName,
             readr::read_tsv(
               file = gzfile(tmp),
               col_names = TRUE,
               quote = "",
               na = "\\N"),
             envir = .GlobalEnv)
    }
    
    imdbTSVfiles("name.basics")
    

    This should store the data in a global variable named name.basics.
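
An alternative worth considering is to have the function return the table and let the caller choose the variable name, which avoids writing into the global environment. The sketch below is only an illustration: the function name read_imdb_tsv and the base_url parameter are hypothetical and not part of the original question; base_url is added so the function can be pointed at another location, for example a local file:// directory for testing.

```r
# Hypothetical helper: download one IMDb TSV file and RETURN it as a tibble.
# base_url is an assumed extra parameter, defaulting to the IMDb datasets page.
read_imdb_tsv <- function(file_name, base_url = "https://datasets.imdbws.com/") {
  url <- paste0(base_url, file_name, ".tsv.gz")
  tmp <- tempfile(fileext = ".tsv.gz")

  # mode = "wb" keeps the gzip archive intact on Windows
  download.file(url, tmp, mode = "wb", quiet = TRUE)

  # The last evaluated expression is the function's (visible) return value
  readr::read_tsv(
    file = gzfile(tmp),
    col_names = TRUE,
    quote = "",
    na = "\\N",
    progress = FALSE
  )
}
```

Usage would then look like name_basics <- read_imdb_tsv("name.basics"). Because the caller does the assignment, the same function can load several files into differently named objects without relying on assign().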