google-bigquery, gzip, parquet

Failed to create table: Error while reading data, error message: Input file is not in Parquet format


When I try to load a *.parquet.gz file into BigQuery as Parquet, I get this error.

Isn't BigQuery supposed to recognize that this is a compressed Parquet file?

When I decompress it and load the resulting .parquet file, it works.


Solution

  • Compressing a whole Parquet file with gzip defeats most of the benefits of Parquet's columnar compression and reduces BigQuery's ability to process or parallelize the import.

    What BigQuery supports is compression of the data blocks inside a Parquet file, not compression of the whole file itself.
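
    To illustrate the difference, here is a minimal sketch in Python (standard library only) that checks whether a file is gzip-wrapped and, if so, unwraps it before loading. It relies on two facts about the formats: a gzip stream starts with the bytes 1f 8b, and an unencrypted Parquet file starts and ends with the 4-byte marker PAR1. The function names sniff_format and gunzip are made up for this example and are not part of any BigQuery tooling.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any gzip stream
PARQUET_MAGIC = b"PAR1"    # first and last four bytes of a plain Parquet file


def sniff_format(path):
    """Return 'gzip', 'parquet', or 'unknown' based on the file's magic bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(0, 2)            # jump to end of file to learn its size
        size = f.tell()
        tail = b""
        if size >= 8:
            f.seek(-4, 2)       # last four bytes
            tail = f.read(4)
    if head[:2] == GZIP_MAGIC:
        return "gzip"
    if head == PARQUET_MAGIC and tail == PARQUET_MAGIC:
        return "parquet"
    return "unknown"


def gunzip(src, dst):
    """Decompress a gzip-wrapped file (hypothetical helper) and return dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

    A *.parquet.gz file reports as gzip here, which is why a loader expecting the PAR1 marker rejects it. After gunzip the same data reports as parquet and can be loaded as in the question.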