Search code examples
hivehiveqlhive-partitions

How to insert overwrite partitions only if partitions not exists in HIVE?


How to insert overwrite partitions only if partitions not exists in HIVE?

Just as title. I'm working on something that always needs to rewrite hive tables. I have tables that has multiple partitions and I only want to insert new partitions without change exist partitions when I rerun the code after change.


Solution

  • You can join with existing partition list and add where it is NULL condition (not joined only). Also you can use NOT EXISTS (it will generate the same plan as left join in Hive) Like this:

       insert overwrite table target_table partition (partition_key)
        select col1, ... coln, s.partition_key
          from source s 
               left join (select distinct partition_key --existing partitions
                           from target_table
                         ) t on s.partition_key=t.partition_key
         where t.partition_key is NULL; --no partitions exists in the target