In Oozie, Java applications are executed in the Hadoop cluster as a map-reduce job with a single mapper task. When a Java MapReduce job (not Hive or any other action type, just a direct MapReduce job) is part of an Oozie workflow, we get a single-mapper launcher job, and the actual MapReduce job runs independently. Is there a way to link the launcher to the actual MapReduce job, for example to get the job ID of the actual job from the launcher's job ID? Is there a command for this?
We can get the launcher ID for any child job ID from the logs link, which can be obtained from
http://<rm httpaddress:port>/ws/v1/history/mapreduce/jobs/<jobid>/jobattempts
The response is an XML document that contains the logs link. If we parse through the syslog at that link, we find a string like
Service: job_
Search the syslog for this string with a regular expression to find the launcher ID. If the job was started by a launcher, the launcher ID appears here (this works even for Java actions in an Oozie workflow). The actual line will look something like this:
INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: mapreduce.job, Service: <jobid>
The job ID that follows Service: is the launcher job ID.
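
The steps above could be sketched in Python roughly as follows. This is a minimal sketch, not a tested tool: the function names, the sample XML, the sample syslog line, and the job ID in it are all made up for illustration. It also assumes that the logs link sits in a logsLink element of the jobattempts response and that job IDs have the shape job_<digits>_<digits>. Downloading the syslog from the logs link is left out because it depends on the cluster setup.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Optional

# Matches the "Service: job_..." fragment quoted above and captures the job ID.
# Assumption: job IDs look like job_<digits>_<digits>.
SERVICE_JOB_RE = re.compile(r"Service:\s*(job_\d+_\d+)")


def extract_logs_links(jobattempts_xml: str) -> List[str]:
    """Return every logs link found in a jobattempts XML response.

    Assumption: the link is stored in <logsLink> elements.
    """
    root = ET.fromstring(jobattempts_xml)
    return [el.text.strip() for el in root.iter("logsLink") if el.text]


def extract_launcher_id(syslog_text: str) -> Optional[str]:
    """Return the job ID that follows 'Service:' in the syslog, or None."""
    match = SERVICE_JOB_RE.search(syslog_text)
    return match.group(1) if match else None


# Hypothetical sample data, only for trying the functions out.
SAMPLE_XML = """<jobAttempts>
  <jobAttempt>
    <id>1</id>
    <logsLink>http://example.invalid:19888/jobhistory/logs/node/container/attempt/user</logsLink>
  </jobAttempt>
</jobAttempts>"""

SAMPLE_SYSLOG = (
    "INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: "
    "Kind: mapreduce.job, Service: job_1400000000000_0001"
)

if __name__ == "__main__":
    print(extract_logs_links(SAMPLE_XML))
    print(extract_launcher_id(SAMPLE_SYSLOG))
```

If the child job was not started by a launcher, the syslog has no matching line and extract_launcher_id returns None, so the caller can tell the two cases apart.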