Tags: java, linux, unix, memory

Growing resident memory usage (RSS) of Java Process


Our recent observations on our production system show that the resident memory usage (RSS) of our Java container keeps growing. To understand why the Java process consumes much more memory than Heap + Thread Stacks + Shared Objects + Code Cache + etc., we investigated with native tools such as pmap. As a result, we found some 64M memory blocks (in pairs) allocated by the native process (probably with malloc/mmap):

0000000000400000      4K r-x--  /usr/java/jdk1.7.0_17/bin/java
0000000000600000      4K rw---  /usr/java/jdk1.7.0_17/bin/java
0000000001d39000   4108K rw---    [ anon ]
0000000710000000  96000K rw---    [ anon ]
0000000715dc0000  39104K -----    [ anon ]
00000007183f0000 127040K rw---    [ anon ]
0000000720000000 3670016K rw---    [ anon ]
00007fe930000000  62876K rw---    [ anon ]
00007fe933d67000   2660K -----    [ anon ]
00007fe934000000  20232K rw---    [ anon ]
00007fe9353c2000  45304K -----    [ anon ]
00007fe938000000  65512K rw---    [ anon ]
00007fe93bffa000     24K -----    [ anon ]
00007fe940000000  65504K rw---    [ anon ]
00007fe943ff8000     32K -----    [ anon ]
00007fe948000000  61852K rw---    [ anon ]
00007fe94bc67000   3684K -----    [ anon ]
00007fe950000000  64428K rw---    [ anon ]
00007fe953eeb000   1108K -----    [ anon ]
00007fe958000000  42748K rw---    [ anon ]
00007fe95a9bf000  22788K -----    [ anon ]
00007fe960000000   8080K rw---    [ anon ]
00007fe9607e4000  57456K -----    [ anon ]
00007fe968000000  65536K rw---    [ anon ]
00007fe970000000  22388K rw---    [ anon ]
00007fe9715dd000  43148K -----    [ anon ]
00007fe978000000  60972K rw---    [ anon ]
00007fe97bb8b000   4564K -----    [ anon ]
00007fe980000000  65528K rw---    [ anon ]
00007fe983ffe000      8K -----    [ anon ]
00007fe988000000  14080K rw---    [ anon ]
00007fe988dc0000  51456K -----    [ anon ]
00007fe98c000000  12076K rw---    [ anon ]
00007fe98cbcb000  53460K -----    [ anon ]

I interpret the line with 0000000720000000 3670016K as the heap space, whose size we define with the JVM parameter -Xmx. Right after that, the pairs begin, each of which sums to exactly 64M. We are using CentOS release 5.10 (Final), 64-bit, and JDK 1.7.0_17.

The question is: what are those blocks, and which subsystem allocates them?

Update: We do not use JIT and/or JNI native code invocations.


Solution

  • It's also possible that there is a native memory leak. A common cause is not closing a ZipInputStream/GZIPInputStream.

    A typical way a ZipInputStream gets opened is via Class.getResource/ClassLoader.getResource followed by openConnection().getInputStream() on the returned java.net.URL instance, or via Class.getResourceAsStream/ClassLoader.getResourceAsStream. One must ensure that these streams always get closed, as in the sketch below.
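
    As an illustration only (the helper class is hypothetical), a try-with-resources block guarantees that such a stream gets closed and its native zip structures released:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;

    class ResourceReader {
        // Hypothetical helper; the important part is the try-with-resources block,
        // which guarantees close() and thereby releases the native zip structures
        // backing streams returned by getResourceAsStream.
        static byte[] readResource(String name) throws IOException {
            try (InputStream in = ResourceReader.class.getResourceAsStream(name)) {
                if (in == null) {
                    throw new IOException("Resource not found: " + name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                return out.toByteArray();
            }
        }
    }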

    Some commonly used open source libraries have had bugs that leak unclosed java.util.zip.Inflater or java.util.zip.Deflater instances. For example, the Nimbus Jose JWT library fixed a related memory leak in version 6.5.1, and Java JWT (jjwt) fixed a similar bug in version 0.10.7. The bug pattern in both cases was that DeflaterOutputStream.close() and InflaterInputStream.close() do not call Deflater.end()/Inflater.end() when a Deflater/Inflater instance is passed in by the caller. In those cases, it's not enough to check the code for streams being closed: every Deflater/Inflater instance created in the code must be handled so that .end() gets called, as in the sketch below.
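
    A minimal sketch of that handling (the helper name and compression level are illustrative): a Deflater passed into a DeflaterOutputStream is not ended by the stream's close(), so end() must be called explicitly, typically in a finally block:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.zip.Deflater;
    import java.util.zip.DeflaterOutputStream;

    class CompressExample {
        // Illustrative sketch: a Deflater provided to a DeflaterOutputStream is NOT
        // ended by the stream's close(), so end() must be called explicitly to
        // release the native zlib memory.
        static byte[] compress(byte[] data) throws IOException {
            Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
            try {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                try (DeflaterOutputStream out = new DeflaterOutputStream(bytes, deflater)) {
                    out.write(data);
                }
                return bytes.toByteArray();
            } finally {
                deflater.end(); // releases the native zlib memory
            }
        }
    }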

    One way to check for Zip*Stream leaks is to get a heap dump and search for instances of any class with "zip", "Inflater" or "Deflater" in its name. This is possible in many heap dump analysis tools such as YourKit Java Profiler, JProfiler or Eclipse MAT (see the example query below). It's also worth checking objects in the finalization state, since in some cases memory is released only after finalization. Checking for classes that might use native libraries is useful; this applies to TLS/SSL libraries too.
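
    For example, in Eclipse MAT an OQL query along these lines (a sketch; run the equivalent search in whichever tool you use) lists the Inflater instances in the dump:

    SELECT * FROM java.util.zip.Inflater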

    There is an OSS tool from Elastic called leakchecker, a Java agent that can be used to find the sources of java.util.zip.Inflater instances that haven't been closed (.end() not called).

    For native memory leaks in general (not just zip library leaks), you can use jemalloc to debug them by enabling malloc sampling profiling via settings in the MALLOC_CONF environment variable. Detailed instructions on using jemalloc to debug a native memory leak in Java applications are available in this blog post: http://www.evanjones.ca/java-native-leak-bug.html . There's also a blog post from Elastic featuring jemalloc and mentioning leakchecker, the tool that Elastic has open-sourced to track down problems caused by unclosed zip Inflater resources. A sample configuration is sketched below.
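
    As a sketch only (the library path and option values are assumptions, jemalloc must be built with profiling support, and the linked post has the authoritative steps), the JVM can be started with jemalloc preloaded and sampling enabled:

    export LD_PRELOAD=/usr/lib/libjemalloc.so                              # path varies per system
    export MALLOC_CONF=prof:true,lg_prof_interval:30,lg_prof_sample:17
    java ...                                                               # inspect the dumped jeprof*.heap files with jeprof afterwards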

    There is also a blog post about a native memory leak related to ByteBuffers. Java 8u102 added a special system property, jdk.nio.maxCachedBufferSize, to limit the cache issue described in that blog post:

    -Djdk.nio.maxCachedBufferSize=262144
    

    It's also good to always check open file handles to see whether the memory leak is caused by a large number of mmapped files. On Linux, lsof can be used to list open files and open sockets:

    lsof -Pan -p PID
    

    A report of the process's memory map could also help investigate native memory leaks:

    pmap -x PID
    

    For Java processes running in Docker, it should be possible to execute the lsof or pmap command on the host. You can find the PID of the containerized process with this command:

    docker inspect --format '{{.State.Pid}}' container_id
    

    It's also useful to get a thread dump (or use jconsole/JMX) to check the number of threads, since each thread consumes 1MB of native memory for its stack. A large number of threads would use a lot of memory (see the example below).
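
    As a quick sketch, assuming the JDK's jstack tool is available, the number of Java threads can be estimated by counting the state lines in a thread dump:

    jstack PID | grep -c 'java.lang.Thread.State'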

    There is also Native Memory Tracking (NMT) in the JVM. That might be useful to check whether it's the JVM itself that is using up the native memory (see the commands below).
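
    A minimal sketch of using NMT: start the JVM with tracking enabled (which adds some overhead) and then query it with jcmd:

    java -XX:NativeMemoryTracking=summary ...      # or =detail for a breakdown by call site
    jcmd PID VM.native_memory summary              # print the current NMT report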

    AsyncProfiler can be used to detect the source of native memory allocations. This is explained in another answer.
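
    As a rough sketch (flags differ between async-profiler versions, so check its documentation), native allocations can be sampled by profiling the malloc symbol of a running process:

    ./profiler.sh -d 30 -e malloc -f malloc-profile.html PID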

    The jattach tool can also be used in a containerized (Docker) environment to trigger thread dumps or heap dumps from the host. It is also able to run jcmd commands, which is needed for controlling NMT (see the example below).
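
    For example (a sketch, assuming jattach is installed on the host and PID is the host-side PID of the containerized JVM):

    jattach PID threaddump                          # print a thread dump
    jattach PID jcmd "VM.native_memory summary"     # run a jcmd command, e.g. to query NMT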