
Sqoop Import Fails


I'm trying to import a table from Oracle into Hive using Sqoop. I used the following command:

sqoop-import --connect jdbc:<connection> --table test1 --username test --password test --hive-table hive_test --create-hive-table --hive-import -m 1    

But this gives me the following error:

 Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory <hdfs path> already exists

I read on several forums that I should delete the directory and run the command again. I did exactly that, but I still get the same error.


Solution

  • You need to understand how a Sqoop Hive import works. It runs in three steps:

    • Import the data into HDFS under <some-dir>
    • Create the Hive table <some-table> IF NOT EXISTS
    • LOAD DATA INPATH '<some-dir>' INTO TABLE <some-table>

    You are getting the error at step 1:

    Output directory <hdfs path> already exists

    Delete this <hdfs path> and run the import again.
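
    As a rough sketch, the directory can be removed and checked with the standard HDFS shell. Here <hdfs path> is the placeholder from the error message, so substitute the exact path that Sqoop printed:

    ```shell
    # Remove the leftover output directory (placeholder path from the error message).
    hdfs dfs -rm -r '<hdfs path>'

    # Confirm it is really gone before rerunning the import.
    hdfs dfs -test -d '<hdfs path>' && echo "directory still exists"
    ```

    If the check still reports the directory, the path that was deleted is probably not the one named in the error.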

    Better way:

    There is no need to delete the directory manually every time.

    Add --delete-target-dir to the command. Sqoop will then delete the import target directory, if it exists, before it starts the import.

    P.S. There is no need to use --create-hive-table with --hive-import; --hive-import creates the table for you by default. With --create-hive-table set, the job fails if the target Hive table already exists.
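
    Putting both suggestions together, the command from the question might look like the sketch below. The connection string, table names, and credentials are the placeholders from the question, not real values:

    ```shell
    # Same import as in the question, with --delete-target-dir added
    # and --create-hive-table dropped.
    # 'jdbc:<connection>' is quoted so the shell does not treat < and > as redirection.
    sqoop import \
      --connect 'jdbc:<connection>' \
      --table test1 \
      --username test \
      --password test \
      --hive-import \
      --hive-table hive_test \
      --delete-target-dir \
      -m 1
    ```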