I am trying to create a tab separated value from a hive query.
Following the Apache Hive wiki, I composed my query as:
INSERT OVERWRITE LOCAL DIRECTORY '/home/luca/query_results/'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t' ESCAPED BY '"' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
SELECT * FROM mytable ;
(please note, that the real query is quite complex, I know I could have used hive -e "Select * from mytable > output
in this simple case).
The query seems to work, but in the query_results directory I find one file with .snappy
extension, and when I open it it seems to be a binary file, not a text file.
What is wrong with my query? How can I get the results of my query in a nicely formatted text file (I want to write an hive script that returns several such files. If I can even specify the name of the file from within hive it would be a bonus).
Adding the comment as answer
The .snappy file issue could be resolved by unsetting the compression. Use the following :
set hive.exec.compress.output=false
The once snappy compressed file cannot be decompressed becuase of its encoded structure.
The above compression property can either be set permanently in the hive-site.xml or you can do it manually every time you run an insert query