
Apache Hadoop 2.6 Java Heap Space Error


I'm getting:

15/04/27 09:28:04 INFO mapred.LocalJobRunner: map task executor complete.
15/04/27 09:28:04 WARN mapred.LocalJobRunner: job_local1576000334_0001
java.lang.Exception: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:983)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:401)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:695)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
15/04/27 09:28:05 INFO mapreduce.Job: Job job_local1576000334_0001 failed    with state FAILED due to: NA
15/04/27 09:28:05 INFO mapreduce.Job: Counters: 0
15/04/27 09:28:05 INFO terasort.TeraSort: done

on Apache Hadoop 2.6 with the following configuration.

MapReduce configuration (mapred-site.xml):

<configuration>

  <property>
    <name>mapred.job.tracker</name>
    <value>n1:54311</value>
  </property>

  <property>
    <name>mapreduce.local.dir</name>
    <value>/home/hadoop/hadoop/maptlogs</value>
  </property>

  <property>
    <name>mapreduce.map.tasks</name>
    <value>32</value>
  </property>

  <property>
    <name>mapreduce.reduce.tasks</name>
    <value>10</value>
  </property>

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>

  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>256</value>
    <description>Added 04/27 @ 10:09am for testing</description>
  </property>

</configuration>
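The OutOfMemoryError above is thrown in MapTask$MapOutputBuffer.init, which allocates the mapreduce.task.io.sort.mb sort buffer inside the map task's heap; note also that the trace shows the job running under mapred.LocalJobRunner, where concurrent map tasks share a single JVM, so each task's sort buffer competes for the same heap. A sketch of internally consistent settings (the values here are illustrative, not taken from the post) keeps the sort buffer well below the task heap, and the heap below the requested container size:

```xml
<!-- Illustrative values only: sort buffer < JVM heap < YARN container -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value> <!-- container requested from YARN; matches yarn.scheduler.minimum-allocation-mb below -->
</property>

<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx3276m</value> <!-- JVM heap, roughly 80% of the container -->
</property>

<property>
  <name>mapreduce.task.io.sort.mb</name>
  <value>1024</value> <!-- sort buffer, well under the heap -->
</property>
```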

and yarn-site.xml

<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>n1:8025</value>
</property>

<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>n1:8030</value>
</property>

<property>
  <name>yarn.resourcemanager.address</name>
  <value>n1:8050</value>
</property>

<property>
  <name>yarn.nodemanager.disk-health-checker.enable</name>
  <value>false</value>
</property>

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>4096</value>
  <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
  <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>

<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
  <description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
</property>

<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>2</value>
  <description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
</property>

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>96000</value>
  <description>Physical memory, in MB, to be made available to running containers.</description>
</property>

<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>32</value>
  <description>Number of CPU cores that can be allocated for containers.</description>
</property>

I also added the following Linux limits in 90-nproc.conf:

*          soft    nproc     20000
root       soft    nproc     unlimited
*          soft    nofile    20000
*          hard    nofile    20000
root       soft    nofile    20000
root       hard    nofile    20000

but I still get a Java heap space error when running TeraSort.

TeraGen, however, runs without any issues.

The environment is:

  1. RedHat 6.6
  2. Kernel 3.18
  3. 11 machines: 1 namenode and 10 datanodes
  4. Apache Hadoop 2.6

Solution

  • The memory limits you specify in mapred-site.xml need to stay below the memory settings in yarn-site.xml, and both should be calculated from the system's resources. I use a script for this that collects the system specifics and generates my core-site.xml, mapred-site.xml, hdfs-site.xml and yarn-site.xml configuration.

    Note: MapReduce runs on top of YARN, so remember to always keep its memory settings below those in yarn-site.xml. With my auto-configuration script for Apache Hadoop 2.6, I can now run a MapReduce job on 6 machines: a 1TB TeraGen finishes within 4 minutes and 57 seconds.

    I am very surprised by the performance of Apache Hadoop.
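The "calculated from system resources" step can be sketched as a small shell calculation. This is a hypothetical sketch, not the author's actual script; the one-container-per-vcore heuristic and the 80%/40% ratios are assumptions:

```shell
#!/bin/sh
# Hypothetical sketch of the memory calculation an auto-configuration
# script might perform (heuristic and ratios are assumptions).
TOTAL_MB=96000                       # yarn.nodemanager.resource.memory-mb
VCORES=32                            # yarn.nodemanager.resource.cpu-vcores
CONTAINER_MB=$((TOTAL_MB / VCORES))  # one container per vcore
HEAP_MB=$((CONTAINER_MB * 8 / 10))   # JVM heap at ~80% of its container
SORT_MB=$((HEAP_MB * 4 / 10))        # io.sort.mb kept well under the heap
echo "container=${CONTAINER_MB}m heap=${HEAP_MB}m io.sort.mb=${SORT_MB}"
```

Each result would then flow into the generated config files (e.g. the heap value becomes the `-Xmx` in `mapred.child.java.opts`), keeping every layer inside the one above it.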