
Can I move data from one Hive partition to another partition of the same table?


My table is partitioned by year/month/date. Using the week-year pattern (YYYY) in SimpleDateFormat created a wrong partition: the data for the date 2017-12-31 was written to the 2018-12-31 partition.

   SimpleDateFormat sdf = new SimpleDateFormat("YYYY-MM-dd");

So what I want is to move my data from partition 2018/12/31 to partition 2017/12/31 of the same table. I did not find any relevant documentation on how to do this.
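The week-year behaviour described above can be reproduced with a small sketch. It assumes `Locale.US` week rules (weeks start on Sunday, and the first week of a year needs only one day in it) and a UTC time zone; the class name `WeekYearDemo` and the `format` helper are hypothetical names chosen for illustration. Other locales can give a different week year for the same date.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class WeekYearDemo {
    // Formats the fixed date 2017-12-31 (a Sunday) with the given pattern.
    // Locale.US and UTC are assumptions made so the result is reproducible.
    static String format(String pattern) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        Calendar cal = new GregorianCalendar(utc, Locale.US);
        cal.clear();
        cal.set(2017, Calendar.DECEMBER, 31);

        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.US);
        sdf.setTimeZone(utc);
        return sdf.format(cal.getTime());
    }

    public static void main(String[] args) {
        // 'Y' is the week year: under US rules this Sunday opens week 1 of 2018.
        System.out.println(format("YYYY-MM-dd")); // prints 2018-12-31
        // 'y' is the calendar year, which is what a partition key needs.
        System.out.println(format("yyyy-MM-dd")); // prints 2017-12-31
    }
}
```

Switching the pattern to lowercase `yyyy` avoids creating new misplaced partitions; the rows already written to 2018/12/31 still have to be moved.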


Solution

  • From what I understand, you would like to move the data from the 2018-12-31 partition to the 2017-12-31 partition. Here is how you can do it.

    -- From Hive/Beeline (partition values are strings, so they must be quoted)
    ALTER TABLE TableName PARTITION (PartitionCol='2018-12-31') RENAME TO PARTITION (PartitionCol='2017-12-31');
    

    From Spark code, you basically have to initialize the HiveContext and run the same HQL from it. You can refer to one of my answers here on how to initialize the HiveContext.
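When the statement is issued from code, it helps to build the HQL string in one place so the partition values are always quoted. The following is a minimal sketch, not a definitive implementation: the class `PartitionRename` and the method `renameSql` are hypothetical names, and the table and column names are the placeholders used in this answer. It only builds the string; running it assumes a Hive-enabled context or session already exists.

```java
public class PartitionRename {
    // Builds an ALTER TABLE ... RENAME TO PARTITION statement for a single
    // partition column, quoting both values as string literals.
    static String renameSql(String table, String col, String from, String to) {
        // Keep the sketch simple: reject values that would break the quoting.
        if (from.contains("'") || to.contains("'")) {
            throw new IllegalArgumentException("partition values must not contain quotes");
        }
        return String.format(
            "ALTER TABLE %s PARTITION (%s='%s') RENAME TO PARTITION (%s='%s')",
            table, col, from, col, to);
    }

    public static void main(String[] args) {
        String sql = renameSql("TableName", "PartitionCol", "2018-12-31", "2017-12-31");
        System.out.println(sql);
        // Assumption: with a Hive-enabled SparkSession named `spark`,
        // the statement would be run as spark.sql(sql).
    }
}
```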

    -- If you want to do it at the HDFS level, below is one of the approaches.
    -- From Hive/Beeline, run the HQL below to make sure the target partition exists
    ALTER TABLE TableName ADD IF NOT EXISTS PARTITION (PartitionCol='2017-12-31');
    
    # Now, from HDFS, just move the data from the 2018 partition to the 2017 partition
    hdfs dfs -mv /your/table_hdfs/path/schema.db/tableName/PartitionCol=2018-12-31/* /your/table_hdfs/path/schema.db/tableName/PartitionCol=2017-12-31/
    
    # Remove the 2018 partition directory if required
    hdfs dfs -rm -r /your/table_hdfs/path/schema.db/tableName/PartitionCol=2018-12-31
    
    -- You can also drop the partition from Beeline/Hive
    ALTER TABLE tableName DROP IF EXISTS PARTITION (PartitionCol='2018-12-31');
    
    -- At the end, repair the table
    MSCK REPAIR TABLE tableName;
    

    Why do I have to repair the table?