
Container allocation code in YARN (Hadoop)


I am trying to tinker with the YARN container allocation code. By container allocation, I mean the decision to place the container on a specific machine in the cluster.

I want to write my own container allocation code. To begin with, I am running Hadoop in pseudo-distributed mode with YARN, and I am trying to locate the relevant points in the source code. So far, using print statements, I have been able to pinpoint the method hadoop-source-code/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ApplicationMasterProtocolPBClientImpl.java#allocate as the place where allocation appears to take place. However, I am unable to narrow it down further. Going further into this method, I have not been able to print anything.

To recap: I would like to locate the exact point in the Hadoop source code where I would need to write my own code to replace the existing container allocation mechanism.


Solution

  • I have not been able to print anything
    

    At first, I thought logging was application-specific, but all information related to the ResourceManager is written to the log file named hadoop-{username}-resourcemanager-{username}.log in the log folder. Instead of print statements, I used LOG.info for debugging.
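As a rough illustration of the LOG.info approach, here is a minimal, self-contained sketch. It uses java.util.logging as a stand-in so that it runs without Hadoop on the classpath; inside the Hadoop source you would instead reuse the LOG field that the class you are editing already declares. The class name AllocationLoggingSketch, the helper describe, and the message format are hypothetical and not part of Hadoop.

```java
import java.util.logging.Logger;

public class AllocationLoggingSketch {
    // Stand-in logger (assumption): in the Hadoop source, use the class's own LOG field instead.
    private static final Logger LOG =
            Logger.getLogger(AllocationLoggingSketch.class.getName());

    // Hypothetical helper that builds the debug message to be logged.
    static String describe(String nodeName, int assigned) {
        return "assignContainers: node=" + nodeName + " assigned=" + assigned;
    }

    public static void main(String[] args) {
        // The message goes through the logging framework instead of System.out,
        // so it ends up wherever the logger is configured to write.
        LOG.info(describe("node-1", 1));
    }
}
```

Messages logged this way from ResourceManager code should then show up in the ResourceManager log file mentioned above rather than on the console.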

  • Location of the allocation mechanism in the Hadoop source code
    

    I am using the FIFO scheduler, and its allocation mechanism is in the method FifoScheduler#assignContainersOnNode, which is called from FifoScheduler#assignContainers, which in turn is called from the FifoScheduler#nodeUpdate method.

    The FifoScheduler#handle method (more information here) keeps track of the different scheduler events. NODE_UPDATE is one of them; it is triggered often, and each time it fires, containers are assigned on the given node.
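The call chain described above can be sketched with a small, self-contained model. This is not the Hadoop code: the class FifoCallChainSketch, the Event type, the trace list, and all method signatures are simplified assumptions, and only the method names and the NODE_UPDATE event name come from the description above. It is meant to show where a custom placement decision would plug in, namely in the innermost method.

```java
import java.util.ArrayList;
import java.util.List;

public class FifoCallChainSketch {
    // Simplified event types (assumption); only NODE_UPDATE is taken from the text above.
    enum EventType { NODE_UPDATE, OTHER }

    // Hypothetical event carrying the name of the node it refers to.
    static final class Event {
        final EventType type;
        final String node;
        Event(EventType type, String node) { this.type = type; this.node = node; }
    }

    // Records the order of calls so the chain can be inspected.
    final List<String> trace = new ArrayList<>();

    // Dispatches events; a NODE_UPDATE leads to container assignment on that node.
    void handle(Event event) {
        switch (event.type) {
            case NODE_UPDATE:
                nodeUpdate(event.node);
                break;
            default:
                trace.add("ignored " + event.type);
                break;
        }
    }

    void nodeUpdate(String node) {
        trace.add("nodeUpdate");
        assignContainers(node);
    }

    void assignContainers(String node) {
        trace.add("assignContainers");
        assignContainersOnNode(node);
    }

    // A custom placement decision would go here in this simplified model.
    int assignContainersOnNode(String node) {
        trace.add("assignContainersOnNode " + node);
        return 1; // pretend one container was assigned
    }

    public static void main(String[] args) {
        FifoCallChainSketch scheduler = new FifoCallChainSketch();
        scheduler.handle(new Event(EventType.NODE_UPDATE, "node-1"));
        System.out.println(scheduler.trace);
    }
}
```

In the real scheduler these methods take richer arguments and do considerably more work, so treat the sketch only as a map of the order in which to place LOG.info calls.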