hadoop elastic-map-reduce

ElasticMapReduce: number of mappers/reducers per EC2 type


Does the number of mappers and reducers differ depending on the EC2 instance type you choose? I found that a Large instance uses 3 mappers and 1 reducer. Is that the same for every other type (for example, an xLarge instance)? I know I can override it through bootstrapping, but I'm just wondering.


Solution

  • No, it isn't the same for every instance type. Amazon has a concept of Hadoop Default Configurations, which are controlled by the AMI version; the latest one is AMI-2.3. These configurations define the default values for a number of Hadoop settings. For example, for an m1.xlarge, the following values are set by default if you use AMI-2.3:

    Parameter                                Value
    HADOOP_JOBTRACKER_HEAPSIZE               6912
    HADOOP_NAMENODE_HEAPSIZE                 2304
    HADOOP_TASKTRACKER_HEAPSIZE              384
    HADOOP_DATANODE_HEAPSIZE                 384
    mapred.child.java.opts                   -Xmx768m
    mapred.tasktracker.map.tasks.maximum     8
    mapred.tasktracker.reduce.tasks.maximum  3
    

    For more, see the following:

    http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HadoopMemoryDefault_AMI2.3.html

    http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-config.html
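
To get a feel for what the m1.xlarge defaults in the table above imply, here is a minimal Python sketch that works out the task slots per node and the worst-case Java heap if every slot is busy. The dictionary values are copied from the table. The function names are made up for illustration, and the sketch assumes the `HADOOP_*_HEAPSIZE` values are in MB and that each worker node runs one TaskTracker and one DataNode.

```python
import re

# Defaults for an m1.xlarge on AMI-2.3, copied from the table in the answer.
M1_XLARGE_AMI_2_3 = {
    "HADOOP_TASKTRACKER_HEAPSIZE": 384,  # assumed to be in MB
    "HADOOP_DATANODE_HEAPSIZE": 384,     # assumed to be in MB
    "mapred.child.java.opts": "-Xmx768m",
    "mapred.tasktracker.map.tasks.maximum": 8,
    "mapred.tasktracker.reduce.tasks.maximum": 3,
}


def child_heap_mb(java_opts):
    """Extract the max heap in MB from a JVM option string such as '-Xmx768m'."""
    match = re.search(r"-Xmx(\d+)m", java_opts)
    if match is None:
        raise ValueError("no -Xmx<N>m setting found in: " + java_opts)
    return int(match.group(1))


def slots_per_node(cfg):
    """Return (map slots, reduce slots) that one TaskTracker offers."""
    return (
        cfg["mapred.tasktracker.map.tasks.maximum"],
        cfg["mapred.tasktracker.reduce.tasks.maximum"],
    )


def worst_case_heap_mb(cfg):
    """Heap used on one worker node if every slot runs a task at full -Xmx,
    plus the TaskTracker and DataNode daemons (hypothetical helper)."""
    maps, reduces = slots_per_node(cfg)
    task_heap = (maps + reduces) * child_heap_mb(cfg["mapred.child.java.opts"])
    daemon_heap = cfg["HADOOP_TASKTRACKER_HEAPSIZE"] + cfg["HADOOP_DATANODE_HEAPSIZE"]
    return task_heap + daemon_heap


def cluster_slots(cfg, node_count):
    """Total (map, reduce) slots across node_count identical worker nodes."""
    maps, reduces = slots_per_node(cfg)
    return maps * node_count, reduces * node_count


if __name__ == "__main__":
    print("slots per node:", slots_per_node(M1_XLARGE_AMI_2_3))
    print("worst-case heap (MB):", worst_case_heap_mb(M1_XLARGE_AMI_2_3))
    print("slots on 4 nodes:", cluster_slots(M1_XLARGE_AMI_2_3, 4))
```

With these defaults, the 11 slots at 768 MB each come to 8448 MB, and the two daemons bring the total to 9216 MB. This counts only the maximum Java heap, not the full memory footprint of each process. If you override the slot counts through a bootstrap action, you can plug your own values into the dictionary to check that they still fit the instance.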