I copy the structure of schema2.card_master over to schema1.card_master using
hive> create table schema1.card_master like schema2.card_master;
That works, and it is partitioned as was the original on a field. This new table has hundreds of fields so they are inconvenient to list out, but I want all the fields populated from the original table using a Join filter. Now I want to populate it using a JOIN:
hive> insert overwrite table schema1.card_master (select * from schema2.card_master ccm INNER JOIN schema1.accounts da on ccm.cm13 = da.cm13);
FAILED: SemanticException 1:23 Need to specify partition columns because the destination table is partitioned. Error encountered near token 'cmdl_card_master'
I checked the partition that was copied over, and it was a field mkt_cd that could take on 2 values, US or PR.
So I try
hive> insert overwrite table schema1.card_master PARTITION (mkt_cd='US') (select * from schema2.card_master ccm INNER JOIN schema1.accounts da on ccm.cm13 = da.cm13);
FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different ''US'': Table insclause-0 has 255 columns, but query has 257 columns.
hive>
What is going on here? Is there any work around to load my data without having to explicitly mention all the fields in the Select statement for schema2.card_master ?
select *
selects columns from each table in a join. Use select ccm.*
instead of select *
to select columns from ccm
table only. Also remove static partition specification ('US'), use dynamic instead, because ccm.*
contains partition column, and when you are loading static partition you should not have partition column in the select.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table schema1.card_master partition(mkt_cd) --dynamic partition
select ccm.* --use alias
from schema2.card_master ccm
INNER JOIN schema1.accounts da on ccm.cm13 = da.cm13
;