I'm trying to load multiple parquet files directly from my Dropbox folder's URL (I haven't downloaded the files locally, to save memory on my computer). I used the following code, but it returns nothing.
library(arrow)
library(dplyr)
files <- list.files(path = "https://www.dropbox.com/sh/g8ck3t859uahkdi/AADw-kp7EYfU-SMZc4mmtCM2a?dl=1", pattern = "*.parquet", full.names = T)
tbl <- sapply(files, read_parquet, simplify=FALSE) %>%
  bind_rows(.id = "id")
I've looked at this and this post, but couldn't figure out how to do it.
I'm using a Windows machine for this task (do I need to set mode to "wb"?), but I can switch to a Mac if need be.
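list.files() only lists directories on the local file system, so it finds nothing when given a URL. If we use the second option of downloading to a destination folder, we can download the shared folder as a zip archive, unzip it, and read the files from there: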
library(arrow)
library(purrr)
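url <- "https://www.dropbox.com/sh/g8ck3t859uahkdi/AADw-kp7EYfU-SMZc4mmtCM2a?dl=1"
filezip <- "/path/to/yourfolder/filenew.zip"
new_folder <- "/path/to/yourfolder/filenew"
# mode = "wb" writes the archive in binary mode, which is needed on Windows
download.file(url, filezip, mode = "wb")
unzip(filezip, exdir = new_folder)
# list only the parquet files in the unzipped folder
files <- list.files(path = new_folder,
                    pattern = "\\.parquet$", full.names = TRUE)
# read each file and row-bind the results into a single data frame
tbl <- map_dfr(files, read_parquet)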
nrow(tbl)
#[1] 168019
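The code in the question also asked for an "id" column via bind_rows(.id = "id"). A minimal sketch of how that could be kept with map_dfr() is below. It does not touch Dropbox: the folder demo_folder and the files a.parquet and b.parquet are made up for illustration, so it runs on its own. With the real data, files would come from the unzipped folder instead.

```r
library(arrow)
library(purrr)

# Hypothetical demo data: two small parquet files in a fresh temporary folder
demo_folder <- tempfile("parquet_demo")
dir.create(demo_folder)
write_parquet(data.frame(x = 1:3), file.path(demo_folder, "a.parquet"))
write_parquet(data.frame(x = 4:5), file.path(demo_folder, "b.parquet"))

files <- list.files(path = demo_folder,
                    pattern = "\\.parquet$", full.names = TRUE)

# Name each path by its file name; map_dfr() stores those names in the "id" column
tbl <- map_dfr(set_names(files, basename(files)), read_parquet, .id = "id")
```

Each row of tbl then carries the name of the file it was read from, which is likely what the original bind_rows(.id = "id") call was meant to do.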