Search code examples
amazon-s3hivehive-metastoretrinohive-partitions

How read data partitons in S3 from Trino


I'm trying to read data partitons in S3 from Trino. What I did exactly:

I uploaded my data with all partitions into S3. I have a specified avro schema, I put it in file local system.

Then I created an external hive table to point to the data location in S3 and to the avro schema in file local system. Table is created.

Then, normaly I can query my data and partitions in S3 from Trino.

Trino>select * from hive.default.my_table;

It return only columns names.

trino>select * from hive.default."my_table$partitions";

it return only name of partitions.

Could you please suggest me a solution how can I read data partitons in S3 from Trino ?

Knowing that I'm using Apache Hive 2, even when I query the table in hive to return the table partitions, it return Ok, and display any thing. I think because Hive 2 we should use MSCK command


Solution

  • Faced the same issue. Once the table is created, we need to manually sync up the schema to the metastore using the below command of trino.

    CALL system.sync_partition_metadata('<schema>', '<table>', 'ADD');
    

    Ref.: https://trino.io/episodes/5.html