Search code examples
insertsavehivecreate-tablecsv

How do I create a tab separated file from a hive query?


I am trying to create a tab separated value from a hive query.

Following the Apache Hive wiki, I composed my query as:

INSERT OVERWRITE LOCAL DIRECTORY '/home/luca/query_results/'
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY  '\t' ESCAPED BY '"' LINES TERMINATED BY '\n' 
STORED AS TEXTFILE
SELECT * FROM mytable ;

(please note, that the real query is quite complex, I know I could have used hive -e "Select * from mytable > output in this simple case).

The query seems to work, but in the query_results directory I find one file with .snappy extension, and when I open it it seems to be a binary file, not a text file.

What is wrong with my query? How can I get the results of my query in a nicely formatted text file (I want to write an hive script that returns several such files. If I can even specify the name of the file from within hive it would be a bonus).


Solution

  • Adding the comment as answer

    The .snappy file issue could be resolved by unsetting the compression. Use the following :

          set hive.exec.compress.output=false
    

    The once snappy compressed file cannot be decompressed becuase of its encoded structure.

    The above compression property can either be set permanently in the hive-site.xml or you can do it manually every time you run an insert query