Tags: hadoop, hadoop-yarn, oozie

Getting YARN action application ID in subsequent action


I am running an Oozie workflow that does map-only distributed model fitting within a map-reduce action. Because there are many mappers, I have written code that compiles the YARN logs of all mapper tasks using yarn logs -applicationId application_x, where application_x is the parent application ID of all the map tasks. Now I want to make this summarization step part of the workflow, so I need to get application_x dynamically: it is the application ID of the previous action. Is there any way to get it?


Solution

  • I have not tested this, but I think you can get this with a workflow EL function:

    wf:actionExternalId(String node)
    
    It returns the external ID for an action node, or an empty string if
    the action has not been executed or has not completed yet.
    

    So in a node that runs after the map-reduce job has completed, you should be able to use something like:

    wf:actionExternalId('mapred-node-name')
    

    I suspect it will return job_xxx instead of application_xxx. The two IDs share the same numeric suffix, so you can handle that by swapping the prefix before calling yarn logs.
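
If the value does come back as job_xxx, the prefix swap could be sketched as below. This is an untested illustration, not part of Oozie: the function names job_id_to_application_id and yarn_logs_command are hypothetical, and it assumes the external ID has the usual job_<clusterTimestamp>_<sequence> form and is passed to the script by the subsequent action. An empty string (the action has not completed) is rejected rather than silently passed to yarn logs.

```python
import re
from typing import List

# Assumed ID shape: job_<clusterTimestamp>_<sequence>, e.g. job_1500000000000_0042.
_JOB_ID = re.compile(r"^job_(\d+)_(\d+)$")


def job_id_to_application_id(external_id: str) -> str:
    """Return the YARN application ID matching a MapReduce job ID.

    Hypothetical helper: values that already start with 'application_'
    are returned unchanged; anything else that is not a job ID raises.
    """
    external_id = external_id.strip()
    if external_id.startswith("application_"):
        return external_id
    match = _JOB_ID.match(external_id)
    if match is None:
        # Also covers the empty string returned for an unfinished action.
        raise ValueError(f"unrecognized external ID: {external_id!r}")
    return f"application_{match.group(1)}_{match.group(2)}"


def yarn_logs_command(external_id: str) -> List[str]:
    """Build the argument list for 'yarn logs -applicationId <id>'."""
    return ["yarn", "logs", "-applicationId", job_id_to_application_id(external_id)]
```

The resulting list can be handed to subprocess.run in the log-summarization script, with the external ID supplied as a command-line argument from the workflow.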