apache-spark, hadoop-yarn

YARN shows more resources than the cluster has


I started an EMR cluster with 3 m3.xlarge instances (1 master and 2 slaves) and I am having some trouble.

According to the AWS documentation, an m3.xlarge instance has 4 vCPUs (https://aws.amazon.com/ec2/instance-types/). What does that mean: 4 threads, or 4 cores with 2 threads each? I ask because the Hadoop UI (port 8088) shows 8 available vcores per instance, but in my experience the cluster behaves like 2 instances with 4 vcores each. Am I wrong, or is this a bug in Amazon or YARN?


Solution

  • The value of 8 vcores comes from the default YARN property:

    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>8</value>
        <description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of physical cores used by YARN containers.</description>
    </property>
    

    Although this is set to a higher value than the actual number of vcores on the instance, containers will be created based on the number of vcores actually available on each NodeManager instance.

    Set this property in yarn-site.xml to match the number of vcores on the instance.
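
    As a rough sketch for the m3.xlarge nodes in the question, the override in yarn-site.xml might look like the following. The value 4 is an assumption taken from the 4 vCPUs quoted in the question, so adjust it for other instance types. The NodeManagers will most likely need a restart before the UI on port 8088 reflects the new total.

    <property>
        <!-- Assumed value: 4, matching the 4 vCPUs of an m3.xlarge instance -->
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>4</value>
    </property>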