I am currently working on a job which copies data from a staging table to the final table. The column in the staging table which is used for partition on the final table has multiple records with single quotes (e.g. supplies'A, demand'A etc). Due to this the impala INSERT OVERWRITE statement is failing with the following message:
Query: insert OVERWRITE rec_details ( rec_id, rec_name, rec_value ) PARTITION (rec_part) SELECT rec_id, rec_name, rec_value, rec_name FROM staging_rec_details Query submitted at: 2017-06-12 03:23:22 (Coordinator: http://hostname:port) Query progress can be monitored at: http://hostname:port/query_plan?query_id=ea4e14229d1c0119:a839f51500000000 WARNINGS: TableLoadingException: Failed to load metadata for table: rec_details CAUSED BY: IllegalStateException: Invalid partition name: rec_part=-supplies'A
DDL Statements are as follows:
--DDL 1 - Staging Table
CREATE EXTERNAL TABLE staging_rec_details(
rec_id STRING,
rec_name STRING,
rec_value STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\007'
LINES TERMINATED BY '\001'
--WITH SERDEPROPERTIES ('serialization.format'='\t', 'field.delim'='\t')
STORED AS TEXTFILE
LOCATION '/staging/staging_rec_details'
--DDL 2 - Final Table
CREATE EXTERNAL TABLE rec_details(
rec_id STRING,
rec_name STRING,
rec_value STRING
)
PARTITIONED BY (rec_part STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\007'
LINES TERMINATED BY '\001'
--WITH SERDEPROPERTIES ('serialization.format'='\t', 'field.delim'='\t')
STORED AS PARQUET
LOCATION '/data/rec_details'
Following is the Impala statement used for insering records:
--Impala SQL
INSERT OVERWRITE rec_details
(
rec_id, rec_name, rec_value
)
PARTITION (rec_part)
SELECT
rec_id, rec_name, rec_value, rec_name
FROM staging_rec_details
How can I insert data into the final table when the partition column has a special character like single quote ?
The issue was resolved by replacing the special character :
-- Modified Impala SQL
INSERT OVERWRITE rec_details
(
rec_id, rec_name, rec_value
) PARTITION (rec_part)
SELECT
rec_id, rec_name, rec_value,
regexp_replace(rec_name,'\'','')
FROM staging_rec_details