I have two tables in impala and I want to move the data from one to another. Both tables have hdfs path like
/user/hive/db/table1 or table2/partitiona/partitionb/partitionc/file
I know the procedure with INSERT INTO
to move the data from one table to another.
What I do not know is how to move also the files in the hdfs paths or if this happens automatically with the INSERT INTO statement
Also, if a table is sorted in the creation settings, if any data insert into it it will be sorted too?
It happens automatically and done by hive. When you do INSERT INTO table1 SELECT * FROM table2
, hive copies data from /user/hive/db/table1
to table2/partitiona/partitionb/partitionc/file
.
You do not have to move anything. You may need to analyze table1 for better performance.
Answer to your second question, if you use sort by
while creating table1, then data will be automatically sorted by in table1 irrespective of data sorted or unsorted in table2.