hive hdfs impala

impala/hive show file format


How can I have Impala or Hive return the file format of the underlying files on HDFS for a table?

I tried:

SHOW FILES IN database.table_name

This lists the files, but the problem is that some people stored Parquet files as .parq and others as .parquet, so the extension is not reliable. Is there any way to return the file format, such that one could use it in a new CREATE statement?


Solution

  • Use good old SHOW CREATE TABLE mytable.
    The output states the file format explicitly. It also shows the folder in which the files are stored. Do not rely on file names or extensions; let Impala decide the names. Below is a sample result from Impala.

    result  
    CREATE TABLE edh.mytable (
      column1 STRING
     )
    STORED AS PARQUET  --file format
    LOCATION 's3a://cc-mys3/edh/user/hive/warehouse/edh.db/mytable' --folder location
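
If the goal is to reuse the format in a new table, two related statements may also help. The following is a minimal sketch, assuming the `edh.mytable` table from the sample above; the name `edh.mytable_copy` is hypothetical. `DESCRIBE FORMATTED` reports the input format, output format and SerDe library in its storage section, and in Impala `CREATE TABLE ... LIKE` creates an empty table with the same columns and, unless a `STORED AS` clause is given, the same file format.

```sql
-- Print the full DDL, including the STORED AS clause.
SHOW CREATE TABLE edh.mytable;

-- Alternative: the storage section lists InputFormat, OutputFormat and SerDe Library.
DESCRIBE FORMATTED edh.mytable;

-- Hypothetical new table (name is an assumption) that inherits
-- the column definitions and file format of the existing table.
CREATE TABLE edh.mytable_copy LIKE edh.mytable;
```

Both approaches read the format from the table metadata rather than from the file names, so the .parq versus .parquet difference does not matter.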