Error while performing sqoop merge


I was trying to run sqoop merge on two data sets imported from a Netezza server. Each data set has a numeric id column and a letter name column, as shown below.

Both tables were imported from Netezza with a command of this form:

sqoop import --connect netezza_url --username uname --password pwd --table sqoop_merge_1 --hive-import --warehouse-dir hdfs_path --create-hive-table sqoop_merge_1 -m 1

sqoop_merge_1:

1,a
2,b
3,c
4,d
5,e

sqoop_merge_2:

4,z
5,y

The merge command is:

sqoop merge --new-data hdfs_path/sqoop_merge_2 --onto hdfs_path/sqoop_merge_1 --target-dir hdfs_path/sqoop_merge_output --jar-file jar_file_path/sqoop_merge_class_name.jar --class-name sqoop_merge_class_name --merge-key id

I created the jar file by using the codegen command:

sqoop codegen --connect netezza_url --username uname --password -pwd --table sqoop_merge_1

But I am getting the following error:

java.io.IOException: Cannot join values on null key. Did you specify a key column that exists?

I have tried everything I know, but I still get the error.

Please help.


Solution

  • Since you are sure the id column exists, this could be a case-sensitivity issue: the value passed to --merge-key may not match the column name's case.

    Check whether the column is named ID (uppercase) in Netezza.

    If it is, try --merge-key ID.
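
One way to check the exact spelling is to look at the Java source that sqoop codegen wrote for the table. The sketch below is only an illustration: find_column_case is a hypothetical helper, and it assumes the generated class declares one "private <Type> <COLUMN>;" field per column, which may differ between Sqoop versions. It searches the file case-insensitively and prints the column name as it is actually spelled there.

```shell
# Hypothetical helper (not part of Sqoop): print how a column name is spelled
# in a Sqoop-generated record class, matching case-insensitively.
# Assumption: the generated source declares one "private <Type> <COLUMN>;"
# field per table column.
find_column_case() {
  java_file="$1"   # path to the generated .java file (location is an assumption)
  column="$2"      # column name as you typed it, e.g. id

  # Find the field declaration regardless of case, then keep only the
  # identifier that sits just before the semicolon.
  grep -iE "^[[:space:]]*private[[:space:]]+[A-Za-z_.]+[[:space:]]+${column};" "$java_file" \
    | sed -E 's/.*[[:space:]]([A-Za-z_0-9]+);.*/\1/'
}
```

Called as find_column_case sqoop_merge_1.java id, it prints nothing if no such field exists. If it prints ID, that spelling is the one to pass to --merge-key in the merge command.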