I'm pretty new to the Parquet file format, and I'm using read_parquet() (in the arrow package) to load a Parquet file (stored in my Dropbox shared folder) into R. However, I received the following error message:
library(arrow)
df <- read_parquet("https://www.dropbox.com/s/mysgf4sojpjgyp7/part-394.parquet?dl=1")
Error: Invalid: Unrecognized filesystem type in URI: https://www.dropbox.com/s/mysgf4sojpjgyp7/part-394.parquet?dl=1
What might be causing this problem, and do I need to partition the URL beforehand?
The file-reading functions in the arrow package do not yet support HTTP[S] URIs, which is why the error says the filesystem type in the URI is unrecognized. We hope to add this feature in a future release (ARROW-7594). In the meantime:
If you have Dropbox installed on the computer where you're running this, use the local path to the file instead of the HTTPS URI.
If you do not have Dropbox installed, then download the file first, like this:
myfile <- tempfile()
download.file(
  "https://www.dropbox.com/s/mysgf4sojpjgyp7/part-394.parquet?dl=1",
  myfile,
  mode = "wb"
)
df <- read_parquet(myfile)
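
If you need to do this for several files, the download-then-read steps above could be wrapped in a small helper. This is only a sketch: read_parquet_url is a hypothetical name, not a function in the arrow package, and it assumes the URL points directly at the file contents (as the ?dl=1 Dropbox link does).

```r
library(arrow)

# Hypothetical helper (not part of arrow): download a remote Parquet file
# to a temporary file, then read it with read_parquet().
read_parquet_url <- function(url, ...) {
  tmp <- tempfile(fileext = ".parquet")
  # mode = "wb" writes the file as binary, which matters on Windows
  status <- download.file(url, tmp, mode = "wb", quiet = TRUE)
  if (status != 0) {
    stop("Download failed for ", url)
  }
  # Extra arguments (e.g. col_select) are passed through to read_parquet()
  read_parquet(tmp, ...)
}
```

You would then call read_parquet_url() with the same link used in download.file() above. The temporary file is left in place and is removed when the R session ends.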