Search code examples
sqoop

warehouse dir argument and re-attempt of map reduce task


I am using warehouse-dir argument for a reason and not using target-dir in my sqoop job. If Map-reduce task is re-attempted, it throws error given below.

How do I fix this?

Since it is only a re-attempt, it makes no difference if I delete directoy before execution of task.

from oozie logs:

Fetching child yarn jobs
tag id : oozie-8b2856ac4965b4431c0e9a9b80fb2bda
2018-05-23 22:45:15,153 [main] INFO  org.apache.hadoop.yarn.client.RMProxy  - Connecting to ResourceManager at ip.compute.internal/172.31.4.192:8032
Child yarn jobs are found - application_1527093425690_0231

Found [1] Map-Reduce jobs from this launcher
Killing existing jobs and starting over:
error message: output directory already exists:

2018-05-23 22:45:26,374 [main] ERROR org.apache.sqoop.tool.ImportTool  - Import failed: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /path_to_directory_in_hdfs/already exists
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:270)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1325)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:203)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:176)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:273)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:513)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
    at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:243)
    at org.apache.sqoop.tool.JobTool.run(JobTool.java:298)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
    at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:196)
    at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:179)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:60)
    at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:234)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
<<< Invocation of Sqoop command completed <<<

Solution

  • adding --delete-target-dir will delete the directory if exists already