clickhouse, duckdb

Create a Parquet file with ClickHouse and read it with DuckDB


Following this guide https://clickhouse.com/docs/knowledgebase/mysql-to-parquet-csv-json, I exported some tables from a MySQL server to Parquet.

However, I can't read these Parquet files with DuckDB.

I can inspect the structure:

DESCRIBE SELECT * FROM 'mytable.parquet';

but if I try to read the data, I get an error:

select ID from 'mytable.parquet';
Error: Invalid Error: Unsupported compression codec "7". Supported options are uncompressed, gzip, snappy or zstd

I guess that ClickHouse is writing LZ4-compressed Parquet files, which DuckDB doesn't support. Can I change the compression format in clickhouse-local?


Solution

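  • To change the Parquet compression method in ClickHouse, use the setting output_format_parquet_compression_method (all Parquet settings are listed at https://clickhouse.com/docs/en/sql-reference/formats#parquet-format-settings). Codec 7 in the Parquet format is LZ4_RAW, so choosing one of the codecs named in the DuckDB error message avoids the problem.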

    For example:

    select ... format Parquet settings output_format_parquet_compression_method='snappy'
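
As a fuller illustration, here is a minimal sketch of the round trip. It is not taken from the guide linked in the question: the `numbers(5)` source and the column alias `ID` are placeholders standing in for the real MySQL export, and the file name `mytable.parquet` is reused from the question. The first statement is meant for ClickHouse (for example inside clickhouse-local); the last two are meant for DuckDB.

```sql
-- ClickHouse: write a small Parquet file using Snappy instead of the default codec.
-- numbers(5) is a stand-in for the real source query.
SELECT number AS ID
FROM numbers(5)
INTO OUTFILE 'mytable.parquet'
FORMAT Parquet
SETTINGS output_format_parquet_compression_method = 'snappy';

-- DuckDB: check which codec each column chunk was written with.
SELECT path_in_schema, compression
FROM parquet_metadata('mytable.parquet');

-- DuckDB: the read that failed in the question.
SELECT ID FROM 'mytable.parquet';
```

Going by the DuckDB error message, 'zstd' or 'gzip' should be readable as well. Snappy is used here only because it is the codec the answer suggests.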