Java memory usage: Can someone explain the difference between memory reported by jconsole, ps, and prstat?

I'm investigating some memory bloat in a Java project. Confounded by the different statistics reported by different tools (we are using Java 8 on Solaris 10).

jconsole gives me three numbers:

Committed: the amount reserved for this process by the OS
Used: the amount actually being used by this process
Max: the amount available to the process (in our case it is limited to 128MB via Java command line option -Xmx128m).

For my project, jconsole reports 119.5MB max, 61.9MB committed, 35.5MB used.

The OS tools report something totally different:

ps -o vsz,rss and prstat -s rss and pmap -x all report that this process is using around 310MB virtual, 260MB physical

So my questions are:

Why does the OS report that I'm using around 5x as much as jconsole says is "committed" to my process?
Which of these measurements is actually accurate? (By "accurate", I mean, if I have 12GB of memory, can I run 40 of these (@ 300MB) before I hit OutOfMemoryException? Or can I run 200 of them (@ 60MB)? (Yes, I know I can't use all 12GB of memory, and yes I understand that virtual memory exists; I'm just using that number to illuminate the question better.)

Solution

This question goes quite deep. I'm just going to mention 3 of the many reasons:

VMs
Shared libraries
Stacks and permgen

VMs

Java is like a virtual mini computer. Imagine you ran an emulator on your computer that emulates an old macintosh computer, for example. The emulator app has a config screen where you set how much RAM is in the virtual computer. If you pick 1GB and start the emulator, your OS is going to say the 'Old Mac Emulator' application is taking 1GB. Eventhough inside the virtual machine, that virtual old mac might be reporting 800MB of 1GB free.

A JVM is the same thing. The JVM has its own memory management. As far as the OS is concerned, java.exe is an app that takes 1GB. As far as the JVM is concerned, there's 400MB available on the heap right now.

A JVM is slightly more convoluted, in that the total amount of memory a JVM 'claims' from the OS can fluctuate. Out of the box, a JVM will generally not ask for the maximum right away, but will ask for more over time before kicking in the garbage collector, or a combination thereof: Heap full? Garbage collect. That only freed up maybe 20% or so? Ask the OS for more. -Xms and -Xmx control this; set them to the same, and the JVM will on bootup ask for that much memory and will never ask for more. In general a JVM will never relinquish any memory it claimed.

JVMs, still, are primarily aimed at server deployments, where you want the RAM dedicated to your VM to be constant. There's little point in having each app take whatever they want when they want it, generally. In contrast to desktop apps where you tend to have a ton of apps running and given that a human is 'operating' it, generally only one app has particularly significant ram requirements.

This explains jconsole, which is akin to reporting the free memory inside the virtual old mac app: It's reporting on the state of the heap as the JVM sees it.

Whereas ps -o and friends are memory introspection tools at the OS level, and they just see the JVM as a big black box.

Which one is actually accurate

They both are. From their perspective, they are correct.

Shared library

OSes are highly complex beasts, these days. To put things in java terms, you can have a single JVM that is concurrently handling 100 simultaneous incoming https calls. One could want to see a breakdown of how much memory each of the currently 100 running 'handlers' is taking up. Okay... so how do we 'file' the memory load of String, the class itself (not any particular instance of String - the code. e.g. the instructions for how .toLowerCase() runs. Those are in memory too, someplace!). The web framework needs it, so does the core JVM, and so does probably every single last one of those 100 concurrent handlers. So how do we 'bookkeep' this?

In other words, the memory load on an entire system cannot be strictly divided up as 'that memory is 100% part of that app, and this memory is 10)% part of this app'. Shared libraries make that difficult.

The JVM is technically capable of rendering UIs, processing images, opening files both using the synchronous as well as the asynchronous API, and even the random access API if your OS offers a separate access library for it, sending network requests in async mode, in sync mode, and more. In effect, a JVM will immediately tell the OS: I can do allllll these things.

In my experience/recollection, most OSes report the total memory load of a single application as the sum of the memory they need as well as all the memory any (shared) library they load, in full.

That means ps and friends overreport JVMs considerably: The JVM loads in a ton of libraries. This doesn't actually cost RAM (The OS also loaded these libraries, the JVM doesn't use any large DLLs/.SO/.JNILIB files of its own, just hooks up the ones the OS provides, pretty much all of them), but is often 'bookkept' as such. You know this is happening if this trivial app:

class Test { public static void main(String[] args) throws Exception {
  System.out.println("Hello!");
  Thread.sleep(100000L);
}}

Already takes more than ~60MB or so.

I mean, if I have 12GB of memory, can I run 40 of these (@ 300MB)

That shared library stuff means each VM's memory load according to ps and friends are over-inflated by however much the shared libraries 'cost', because each JVM is going to share that library - the OS only loads it once, not 40 times.

Stacks and permgen

The 'heap', which is where newly created objects go, is the largest chunk of any JVM's memory load. It's also generally the only one JVM introspection tools like jconsole show you. However, it's not the only memory a JVM needs. There's a small slice it needs for its core self (the 'C code', so to speak). Each active thread has a stack and each stack also needs memory. By default it's whatever you pass to -Xss, but times the number of concurrent threads. But that's not a certainty: You can construct a new thread with an alternate size (check the constructors of j.l.Thread). There used to be 'permgen' which is where class code lived. Modern JVM versions got rid of it; in general newer JVM versions try to do more and more on heap instead of in magic hard-to-introspect things like permgen.

I mean, if I have 12GB of memory, can I run 40 of these (@ 300MB) before I hit OutOfMemoryException?

Run all 40 at once, and always specify both -Xms and -Xmx, setting them to equal sizes. Assuming all those 40 JVMs are relatively stable in terms of how many concurrent threads they ever run, if you're ever going to run into memory issues, it'll happen immediately (due to -Xms and -Xmx being equal you've removed the dynamism from this situation. All JVMs pretty much instaclaim all the memory they will ever claim, so it either 'works' or it won't. Stacks mess with the cleanliness of this somewhat, hence the caveat of stable-ish thread counts).