The command that I was running is:
hadoop com.teradata.hadoop.tool.TeradataExportTool \
  -url jdbc:teradata://tdx/TMODE=ANSI,CHARSET=UTF8,database=db \
  -username xxx -password xxx -jobtype hcat \
  -sourcetable customers -sourcedatabase xxx \
  -nummappers 1 -targettable customers
While running the job, I get this exception:
com.teradata.connector.common.exception.ConnectorException: DATE Field data type is not supported
    at com.teradata.connector.hive.utils.HiveSchemaUtils.lookupHiveDataTypeByName(HiveSchemaUtils.java:475)
    at com.teradata.connector.hcat.utils.HCatSchemaUtils.getRecordSchema(HCatSchemaUtils.java:396)
    at com.teradata.connector.hcat.processor.HCatInputProcessor.inputPreProcessor(HCatInputProcessor.java:89)
    at com.teradata.connector.common.tool.ConnectorJobRunner.runJob(ConnectorJobRunner.java:116)
    at com.teradata.connector.common.tool.ConnectorExportTool.run(ConnectorExportTool.java:62)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.teradata.hadoop.tool.TeradataExportTool.main(TeradataExportTool.java:29)
Is there any way I can get around this issue? The create table statement of the Hive table is:
CREATE EXTERNAL TABLE `customers`(
`row_id` string,
`source_start_date` date,
`source_start_timestamp` timestamp,
`target_start_timestamp` timestamp,
`target_start_date` date,
`source_end_date` date,
`source_end_timestamp` timestamp,
`target_end_date` date,
`target_end_timestamp` timestamp,
`active` string,
`is_deleted` string,
`status_flag` string,
`first_name` string,
`last_name` string,
`city` string
)
PARTITIONED BY (
`join_dt` date)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'xxx'
TBLPROPERTIES (
'transient_lastDdlTime'='1461674429')
Any pointers are appreciated.
I did not find a way to make the DATE type work directly, but I found a workaround: create a temp table with the date columns cast to string. For example, say you created a table like:
Create table person ( person_id int, joining_date date);
From this table you can create a temp Hive table like:
Create table temp_person as select person_id, cast(joining_date as string) as joining_date from person;
On the target Teradata side you can declare the datatype as DATE:
Create multiset table person ( person_id int, joining_date date);
And now you can run the job like:
hadoop com.teradata.hadoop.tool.TeradataExportTool \
  -url jdbc:teradata://tdx/TMODE=ANSI,CHARSET=UTF8,database=db \
  -username xxx -password xxx -jobtype hcat \
  -sourcetable temp_person -sourcedatabase xxx \
  -nummappers 1 -targettable person
This works fine, but because of the overhead of copying the table, performance slows down a little.
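For reference, applied to the customers table from the question, the temp-table step might look like the sketch below. This is only a sketch: temp_customers is a hypothetical name, and it assumes that only the DATE columns trip the connector, so the timestamp and string columns are passed through unchanged. Note that the partition column join_dt becomes a regular column in the temp table, which is fine for the export.
Create table temp_customers as
select
  row_id,
  cast(source_start_date as string) as source_start_date,   -- DATE columns cast to string
  source_start_timestamp,
  target_start_timestamp,
  cast(target_start_date as string) as target_start_date,
  cast(source_end_date as string) as source_end_date,
  source_end_timestamp,
  cast(target_end_date as string) as target_end_date,
  target_end_timestamp,
  active,
  is_deleted,
  status_flag,
  first_name,
  last_name,
  city,
  cast(join_dt as string) as join_dt                          -- former partition column
from customers;
Then point -sourcetable at temp_customers in the export command (the Teradata target table keeps its DATE columns), and drop the temp table once the export finishes.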