
Oozie sample on EMR


Can someone please explain what the name node and job tracker are for an Oozie action when working on EMR (EMRFS)? I understand that the name node is specific to HDFS, but if I'm using EMRFS, what should its value be in Oozie?


Solution

  • name-node should be the FQDN:port or IP:port of the EMR master node, where the HDFS NameNode daemon runs. job-tracker is the address of the YARN ResourceManager. Both remain unchanged with or without EMRFS, because Oozie itself still uses HDFS (not S3). Depending on the action, the YARN containers (mappers/reducers) might use EMRFS, and you do not need to set anything for that.

    See this list to find the relevant ports for EMR: http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-4.2.0/emr-release-differences.html#w2ab1c66c15

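    You can also find them in the fs.default.name and mapred.job.tracker settings of the core-site.xml / yarn-site.xml / mapred-site.xml files. On Hadoop 2 (YARN) clusters the equivalent properties are fs.defaultFS and yarn.resourcemanager.address.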
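
    As a rough illustration of reading those settings, here is a minimal Python sketch that pulls the two values out of the Hadoop config files. The property names fs.defaultFS (with fs.default.name as a fallback) and yarn.resourcemanager.address are the usual Hadoop 2 names. The host name and the ports 8020 and 8032 in the sample XML are hypothetical placeholders, so check the real values on your own cluster. The helper names read_hadoop_props and oozie_endpoints are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample contents; on a real cluster read the actual files instead.
SAMPLE_CORE_SITE = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ip-10-0-0-1.ec2.internal:8020</value>
  </property>
</configuration>
"""

SAMPLE_YARN_SITE = """
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>ip-10-0-0-1.ec2.internal:8032</value>
  </property>
</configuration>
"""


def read_hadoop_props(xml_text):
    """Return a dict of property name -> value from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def oozie_endpoints(core_site_xml, yarn_site_xml):
    """Pick the values normally used for nameNode and jobTracker in an Oozie job."""
    core = read_hadoop_props(core_site_xml)
    yarn = read_hadoop_props(yarn_site_xml)
    # Prefer the Hadoop 2 name, fall back to the deprecated one.
    name_node = core.get("fs.defaultFS") or core.get("fs.default.name")
    job_tracker = yarn.get("yarn.resourcemanager.address")
    return {"nameNode": name_node, "jobTracker": job_tracker}


if __name__ == "__main__":
    # Print the values in job.properties style.
    for key, value in oozie_endpoints(SAMPLE_CORE_SITE, SAMPLE_YARN_SITE).items():
        print(f"{key}={value}")
```

    On the master node the files are typically under /etc/hadoop/conf/ (an assumption about the cluster layout). Open them in binary mode and pass the bytes to the same functions, since ElementTree rejects a text string that carries an XML encoding declaration.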