
How to limit the number of rows read from a Parquet file in sparklyr


I have a huge Parquet file that fits neither in memory nor on disk when read. Is there a way to use spark_read_parquet to read only the first n rows?


Solution

  • This might be a hacky way, but

    spark_read_parquet(..., memory=FALSE) %>% head(n)
    

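    seems to do the job for me. With memory=FALSE the table is not cached in Spark when it is read, so head(n) stays lazy and Spark only has to return n rows.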
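
    A slightly fuller sketch of the same idea, in case it helps. The helper name read_parquet_head, the table name "parquet_preview" and the file path are made up for illustration, and it assumes a working Spark connection. Note that without an explicit ordering, which n rows come back is up to Spark.

```r
library(sparklyr)
library(dplyr)

# Hypothetical helper: map a Parquet file lazily and pull only n rows into R.
read_parquet_head <- function(sc, path, n = 100) {
  spark_read_parquet(
    sc,
    name = "parquet_preview",  # hypothetical table name
    path = path,
    memory = FALSE             # do not cache the whole file in Spark memory
  ) %>%
    head(n) %>%                # stays lazy; becomes a LIMIT on the Spark side
    collect()                  # only the limited result is brought into R
}

# Usage, assuming a local Spark installation and a placeholder path:
# sc <- spark_connect(master = "local")
# preview <- read_parquet_head(sc, "path/to/file.parquet", n = 10)
# spark_disconnect(sc)
```

    If you only need to inspect the rows in Spark rather than in R, dropping collect() keeps the result as a lazy Spark table.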