
Does Samza work with ResourceManager in HA?


Does anyone have Samza working with resource manager in HA? If so, what do I set yarn.resourcemanager.hostname to in yarn-site.xml?

If I set it to the first of my RMs, job submission works fine when I submit from that RM and it is the active one. If the RM machine I submit from is not the active one, I get connection refused errors on port 8032.


Solution

  • Yes, we have Samza running with the RM in HA mode. Make sure yarn-site.xml sets the properties defined below. With them in place, job submission will try connecting to the other RM if the first one doesn't succeed.

        <property>
          <name>yarn.resourcemanager.hostname</name>
          <value>yarn_resource_manager_hostname</value>
        </property>
        <property>
          <name>yarn.resourcemanager.ha.enabled</name>
          <value>true</value>
        </property>
        <property>
          <name>yarn.resourcemanager.cluster-id</name>
          <value>yarn_cluster_id</value>
        </property>
        <property>
          <name>yarn.resourcemanager.ha.rm-ids</name>
          <value>rm1,rm2</value>
        </property>
        <property>
          <name>yarn.resourcemanager.hostname.rm1</name>
          <value>yarn_resource_manager_hostname</value>
        </property>
        <property>
          <name>yarn.resourcemanager.hostname.rm2</name>
          <value>yarn_resource_manager2_hostname</value>
        </property>
        <property>
          <name>yarn.resourcemanager.address.rm1</name>
          <value>yarn_resource_manager_hostname:8032</value>
        </property>
        <property>
          <name>yarn.resourcemanager.address.rm2</name>
          <value>yarn_resource_manager2_hostname:8032</value>
        </property>
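
For context on why the properties above help: with HA enabled, the YARN client goes through the RMs listed in yarn.resourcemanager.ha.rm-ids instead of relying on a single address. As far as I know, the client-side setting below is already the default in stock Hadoop, so it normally does not need to be added. It is shown only as a sketch of what drives the retry, and the value is an assumption that should be checked against the yarn-default.xml of your Hadoop version.

```xml
<!-- Assumed default, shown for illustration only; verify against your Hadoop version -->
<property>
  <name>yarn.client.failover-proxy-provider</name>
  <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>
```

To see which RM is currently active, `yarn rmadmin -getServiceState rm1` (and the same for rm2) should report either active or standby.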