Search code examples
hivehiveqlparquethadoop2hive-partitions

Hive Partitioned Table - trying to load data from one table to a partitioned table in my Hive and getting [Error 10044]*


So I have a table with 20 columns and i created another partitioned table - using 2 partition values, now when i try to load data from my table with 20 columns into the other partitioned tables i get error saying my partitioned table has more columns than the table i am inserting data from

my create table statement:

  create table flight_data_parquet(
  YEAR INT,
  FL_DATE STRING,
  UNIQUE_CARRIER STRING,
  AIRLINE_ID INT,
  CARRIER STRING,
  TAIL_NUM STRING,
  FL_NUM INT,
  ORIGIN_AIRPORT_ID INT,
  ORIGIN_AIRPORT_SEQ_ID INT,
  ORIGIN STRING,
  DEST_AIRPORT_ID INT,
  DEST_AIRPORT_SEQ_ID INT,
  DEST STRING,
  DEP_DELAY FLOAT,
  ARR_DELAY FLOAT,
  CANCELLED TINYINT,
  DIVERTED TINYINT,
  DISTANCE INT)
partitioned by (Month INT, DAY_OF_MONTH INT)stored AS PARQUET;

-insert statement:

insert into table flight_data_parquet partition(month=1, day_of_month) 
select  YEAR,FL_DATE,
  UNIQUE_CARRIER,
  AIRLINE_ID,
  CARRIER,
  TAIL_NUM,
  FL_NUM,
  ORIGIN_AIRPORT_ID,
  ORIGIN_AIRPORT_SEQ_ID,
  ORIGIN,
  DEST_AIRPORT_ID,
  DEST_AIRPORT_SEQ_ID,
  DEST,
  DEP_DELAY,
  ARR_DELAY,
  CANCELLED,
  DIVERTED,
  DISTANCE, month, day_of_month
from flight_data_v2 where month=1;

the error i get is -

FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'day_of_month': Table insclause-0 has 19 columns, but query has 20 columns.
hive (flights)> 

Solution

  • month=1 in the partition specification partition(month=1, day_of_month) - is a static partition and value is already specified , remove month from the select query. Only day_of_month (dynamic partition) should be in the select:

    insert into table flight_data_parquet partition(month=1, day_of_month)  -- Month=1 is a static partition 
    select  YEAR,FL_DATE,
      UNIQUE_CARRIER,
      AIRLINE_ID,
      CARRIER,
      TAIL_NUM,
      FL_NUM,
      ORIGIN_AIRPORT_ID,
      ORIGIN_AIRPORT_SEQ_ID,
      ORIGIN,
      DEST_AIRPORT_ID,
      DEST_AIRPORT_SEQ_ID,
      DEST,
      DEP_DELAY,
      ARR_DELAY,
      CANCELLED,
      DIVERTED,
      DISTANCE, day_of_month
    from flight_data_v2 where month=1;