mapreduce hadoop-yarn hadoop2

Get yarn applicationId from a submitted mapreduce job


I need to get the YARN applicationId of a MapReduce job from within my program, but I can't find an API that exposes it. Here is an example of my MapReduce job:

Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
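// submit() returns immediately; the job ID is assigned once the job is submitted
job.submit();
// block until the job finishes, printing progress to the console
job.waitForCompletion(true);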

Is there an API similar to job.getJobID() that returns the YARN applicationId? I know about the yarn application -list command, but I need the applicationId inside my program, through some kind of API. The jobId appears to be the same as the applicationId except for the prefix ('job' vs 'application'), which I could parse, but I am hoping the API offers something I can use directly.


Solution

  • I ended up parsing the jobId: I remove the 'job' prefix and add the 'application' prefix. The applicationId does not appear to be exposed for a MapReduce job, and it is essentially the same ID as the jobId with a different prefix. It's a hacky approach, but it works for now.
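The prefix swap described above could look roughly like the sketch below. It assumes the job ID string has the form job_<timestamp>_<sequence>, and that the matching application ID differs only in its prefix, as observed in the question. The class name YarnAppIds, the method toApplicationId, and the sample ID are made up for illustration; in a real driver the input string would come from job.getJobID().toString() after the job has been submitted.

```java
// Hypothetical helper, not part of Hadoop: converts a MapReduce job ID string
// into the corresponding YARN application ID string by swapping the prefix.
final class YarnAppIds {
    private static final String JOB_PREFIX = "job_";
    private static final String APP_PREFIX = "application_";

    private YarnAppIds() {}

    // Assumes the two IDs share the same "<timestamp>_<sequence>" suffix.
    static String toApplicationId(String jobId) {
        if (jobId == null || !jobId.startsWith(JOB_PREFIX)) {
            throw new IllegalArgumentException("Not a MapReduce job ID: " + jobId);
        }
        return APP_PREFIX + jobId.substring(JOB_PREFIX.length());
    }

    public static void main(String[] args) {
        // Made-up sample ID; a real one would be read from the submitted Job.
        String jobId = "job_1400000000000_0001";
        System.out.println(toApplicationId(jobId));
    }
}
```

Rejecting strings without the 'job' prefix keeps the helper from silently producing a bogus application ID if it is ever handed something else, for example an ID read before the job was submitted.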