hadoop mapreduce distcp

How do I determine if a call to distcp2 was successful?


The best advice I could find online is that you should either compare the files after the transfer or make a second run with -update, and the second approach is considered unreliable.

Is there a way of determining if the call even returned without an exception?


Solution

  • If the distcp command fails, it returns a non-zero exit code, which means something went wrong, and it prints the exception to the console, reporting the job as failed.

    In the end, the distcp command executes a MapReduce job that you can track in the ResourceManager (you can use the API to automate the validation). The job's counters show how many files were copied and how many were skipped, and in a similar way you can tell whether the job finished successfully or not.
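
Both checks can be scripted. What follows is a minimal Python sketch, not a definitive implementation. The function names are made up for illustration, and the source path, destination path, ResourceManager address and application id are placeholders you would replace with your own. It assumes the hadoop binary is on the PATH and that the ResourceManager REST endpoint /ws/v1/cluster/apps/{appid} is reachable from where the script runs.

```python
import json
import subprocess
import urllib.request


def distcp_succeeded(src, dst, hadoop_bin="hadoop"):
    """Run distcp and report success based on the process exit code.

    Raises FileNotFoundError if the hadoop binary cannot be found.
    """
    result = subprocess.run([hadoop_bin, "distcp", src, dst])
    # distcp exits with 0 on success and a non-zero code on failure.
    return result.returncode == 0


def fetch_app_report(rm_url, app_id):
    """Fetch the JSON report for one application from the ResourceManager.

    rm_url is a placeholder such as "http://rm-host:8088".
    """
    url = f"{rm_url}/ws/v1/cluster/apps/{app_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def app_succeeded(app_report):
    """Return True only if the application finished and its final status is SUCCEEDED."""
    app = app_report.get("app", {})
    return app.get("state") == "FINISHED" and app.get("finalStatus") == "SUCCEEDED"
```

You would call distcp_succeeded("hdfs://nn1/src", "hdfs://nn2/dst") for the exit-code check. For the second check, take the application id that distcp logs for its job and call app_succeeded(fetch_app_report("http://rm-host:8088", app_id)). The copied and skipped file counts live in the job counters, which this sketch does not read.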