I am attempting to read in a relatively large parquet file (~4M rows, ~100 columns). Can someone please help me interpret the following error messages?
I have no trouble reading/writing the files in csv form. After converting them to parquet files, I am attempting to read one in using arrow::read_parquet,
to little avail. When I attempt to read it in, I get the following errors.
library(tidyverse)
library(arrow)
par <- file.path(dir, 'path', 'to', 'my', 'file.parquet') %>%
  read_parquet()
glimpse(par)
# Error in setalloccol(newx) :
# Internal error: length of names (0) is not length of dt (109)
and I get
names(par)
# NULL
Having said this, I can observe that the csv version and the parquet version have the same number of rows and columns.
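For anyone wanting to run the same kind of check, here is a minimal sketch of inspecting the shape and column names of a parquet file without converting it to a data frame. It assumes the as_data_frame argument of read_parquet is available in the installed arrow version, and it uses a small toy file (hypothetical data and paths) in place of my real one.

```r
library(arrow)

# Toy stand-in for the real file (hypothetical data).
toy <- data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))
tmp <- tempfile(fileext = ".parquet")
write_parquet(toy, tmp)

# Keep the result as an Arrow Table instead of converting to a data frame,
# so the data frame conversion step is skipped.
tbl <- read_parquet(tmp, as_data_frame = FALSE)

dim(tbl)    # number of rows and columns
names(tbl)  # column names as stored in the file
```

If the names and dimensions look right here but not after a plain read_parquet() call, that would point at the conversion to a data frame rather than at the file itself.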
The other common error I receive is:
Error in `[[<-.data.frame`(`*tmp*`, "..row.names..", value = 1:3279887) :
replacement has 3279887 rows, data has 0
This problem was specific to the arrow version and has since been patched. I was previously using version 1.0.0 but cannot replicate the error on 4.0.1.
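A minimal sketch of how one might check for the same situation: compare the installed arrow version against 4.0.1 (the version on which I could no longer reproduce the error) and round-trip a small data frame to confirm the column names survive. The toy data and the temporary file are hypothetical stand-ins, not my real file.

```r
library(arrow)

# Version on which the error could no longer be reproduced (from the post).
known_good <- "4.0.1"
installed <- packageVersion("arrow")

if (installed < known_good) {
  message("arrow ", as.character(installed), " is older than ", known_good,
          "; updating the package may be worth trying.")
}

# Round-trip a small toy data frame (hypothetical data) through parquet.
toy <- data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))
tmp <- tempfile(fileext = ".parquet")
write_parquet(toy, tmp)
back <- read_parquet(tmp)

names(back)  # should list the original column names rather than NULL
dim(back)
```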