I need to get the YARN applicationId from a MapReduce job, but I can't find any API to do that. An example of my MapReduce job:
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.submit();
job.waitForCompletion(true);
Is there an API similar to job.getJobID() to retrieve the YARN applicationId? I know about the yarn application -list command, but I need to obtain the applicationId in my program through some kind of API. It looks like the jobId is the same as the applicationId except for the prefix ('job' vs 'application'), which I could parse, but I am hoping there is something in the API I can use.
I ended up parsing the jobId and replacing the 'job' prefix with 'application', since the applicationId does not appear to be exposed for a MapReduce job and it is essentially the same ID as the jobId with a different prefix. It's a hacky approach, but it works for now.
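A minimal sketch of that prefix swap in plain Java, with no Hadoop dependency. The class name JobIdToApplicationId, the method toApplicationId, and the sample ID are hypothetical and only for illustration. It assumes the jobId string has the form job_<clusterTimestamp>_<sequence>, as the observation above suggests.

```java
// Hypothetical helper: converts a MapReduce job ID string into the
// matching YARN application ID string by swapping the prefix.
final class JobIdToApplicationId {
    private static final String JOB_PREFIX = "job_";
    private static final String APPLICATION_PREFIX = "application_";

    private JobIdToApplicationId() {}

    static String toApplicationId(String jobId) {
        // Reject anything that does not look like a job ID rather than
        // silently returning a malformed application ID.
        if (jobId == null || !jobId.startsWith(JOB_PREFIX)) {
            throw new IllegalArgumentException("Not a MapReduce job ID: " + jobId);
        }
        return APPLICATION_PREFIX + jobId.substring(JOB_PREFIX.length());
    }

    public static void main(String[] args) {
        // In the real program this string would come from
        // job.getJobID().toString(), called after job.submit().
        // The value below is a made-up sample.
        String jobId = "job_1410289045532_0001";
        System.out.println(toApplicationId(jobId)); // prints application_1410289045532_0001
    }
}
```

The job ID is only assigned once the job has been submitted, so the conversion has to happen after job.submit() and, if the ID is needed while the job runs, before job.waitForCompletion(true) blocks. Hadoop also ships org.apache.hadoop.mapreduce.TypeConverter, whose toYarn(JobID) result exposes getAppId(). Whether that class is meant for client code is worth checking against your Hadoop version before relying on it instead of string parsing.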