
Hadoop gen1 vs Hadoop gen2


I am a bit confused about where the tasktracker fits in Hadoop-2.x.

Daemons in Hadoop-1.x are namenode, datanode, jobtracker, tasktracker and secondarynamenode.

Daemons in Hadoop-2.x are namenode, datanode, resourcemanager, applicationmaster, secondarynamenode.

This means the jobtracker has been split into the resourcemanager and the applicationmaster.

So where is tasktracker?


Solution

  • In YARN (the new execution framework in Hadoop 2), MapReduce doesn't exist in the way it did before.

    YARN is a more general-purpose way to allocate resources on the cluster. The ResourceManager, ApplicationMaster, and NodeManager now make up the new YARN execution framework. The NodeManager is the daemon that runs on every node, so I guess you could say it replaced the TaskTracker. But now it launches general processes (containers) rather than only map tasks and reduce tasks.

    MapReduce is still there, but it is now an "application" of YARN.

    Here is an introduction to YARN, which goes into much more depth: http://hortonworks.com/blog/introducing-apache-hadoop-yarn/
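
To make the mapping concrete, here is a rough sketch in Python of how the Hadoop 1 daemons line up with their Hadoop 2 counterparts as described above. The dictionary and the `successors` helper are my own illustration and not part of any Hadoop API. The mapping is also a simplification: the NodeManager is only a loose replacement for the TaskTracker.

```python
# Illustrative only: a loose mapping of Hadoop 1 daemons to the Hadoop 2
# components that took over their roles. None of these names are an API.
HADOOP1_TO_HADOOP2 = {
    "NameNode": ["NameNode"],
    "DataNode": ["DataNode"],
    "SecondaryNameNode": ["SecondaryNameNode"],
    # The JobTracker's responsibilities were split in two.
    "JobTracker": ["ResourceManager", "ApplicationMaster"],
    # The per-node daemon; it runs generic processes, not just map/reduce tasks.
    "TaskTracker": ["NodeManager"],
}


def successors(daemon):
    """Return the Hadoop 2 components that cover a Hadoop 1 daemon's role."""
    return HADOOP1_TO_HADOOP2[daemon]


if __name__ == "__main__":
    for old, new in HADOOP1_TO_HADOOP2.items():
        print(f"{old} -> {', '.join(new)}")
```

Calling `successors("TaskTracker")` returns the NodeManager entry, which is the short answer to the question.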