java memory-management garbage-collection jvm jvm-hotspot

What is the purpose of Survivor Space in Java memory?


I tried looking this up, but all the questions and answers I came across discuss the purpose of having two survivor spaces. I would like to understand the purpose of having a survivor space at all. What is the benefit of moving objects from Eden to Survivor?


Solution

  • Performance.

    In general, splitting the heap (by generation or by any other discriminator) has been seen as a good thing, though not all collectors do it (Shenandoah, for example, was not originally designed as a generational collector).

    Why is that a good thing? It takes time to scan the entire heap for live objects. How do you tell your garbage collector that it is time to run? When is that time? You could say: run after every 100th allocated object. Is that too soon (what if those objects occupy only a tiny fraction of the heap)? Or worse, is it too late? What if you say: trigger a collection at 65% heap occupancy (G1 uses a similar occupancy-based trigger, among others: by default it starts a concurrent marking cycle at an initial threshold of 45%, set by -XX:InitiatingHeapOccupancyPercent)? What if, at that 65%, you find out that the majority of objects should have been collected much earlier and have been sitting in the heap for far too long?
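
    To make the "trigger at 65% occupancy" idea concrete, here is a minimal sketch that reads the current heap occupancy through the standard java.lang.management API and compares it with a fixed threshold. The class name OccupancyCheck, the method names, and the 0.65 value are assumptions for illustration only; a real collector makes this decision internally and far more carefully.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class OccupancyCheck {

    /** Fraction of the heap in use, relative to max (or to committed if max is undefined). */
    static double heapOccupancy() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when no maximum is defined, so fall back to committed.
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    /** Hypothetical fixed-threshold policy: would we want a collection right now? */
    static boolean shouldCollect(double occupancy, double threshold) {
        return occupancy >= threshold;
    }

    public static void main(String[] args) {
        double occupancy = heapOccupancy();
        System.out.printf("heap occupancy: %.1f%%, at or over 65%%: %b%n",
                occupancy * 100, shouldCollect(occupancy, 0.65));
    }
}
```

    A single number like this says nothing about how long the objects behind it have been dead, which is exactly the weakness described above.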

    You can see that this gets complicated fast. Things get worse once you realize that scanning the heap takes time, and the last thing you want is for your application to stall while the GC is running. Bear in mind, though, that some collectors scan the heap concurrently and so largely avoid this problem (Shenandoah, ZGC or C4).

    If you could separate the heap, you could scan only a portion of it, which takes little time. People call these "minor" collections. Some collectors therefore divide the heap into "young" and "old". This separation rests on the premise of "infant mortality": young objects die fast. If you have this separation and young objects do die soon, you can scan only a certain portion of the heap and, in the majority of cases, deal only with that. It also simplifies the answer to the question of when a GC is supposed to run: when young is full, of course.
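
    If you want to see how your own JVM splits the heap, the sketch below lists the heap memory pools and which collector manages which pools, using the standard java.lang.management API. The class name HeapLayout is an assumption for illustration. The pool and collector names in the output depend on the JVM and the selected collector; with G1 on HotSpot you would typically see names such as "G1 Eden Space", "G1 Survivor Space" and "G1 Old Gen".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeapLayout {

    /** Names of all memory pools that belong to the Java heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("heap pools: " + heapPoolNames());
        // Each collector bean reports the pools it manages and how often it has run.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " manages " + Arrays.toString(gc.getMemoryPoolNames())
                    + ", collections so far: " + gc.getCollectionCount());
        }
    }
}
```

    On a generational collector, the bean responsible for the young pools usually shows a much higher collection count than the one covering the old generation.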

    And now to your direct point: why is a Survivor space needed at all? Let's assume it isn't there. The first GC cycle happens (the young region is full; let's call it Eden, to be exact). What happens next? The GC needs to determine what is alive there, move it to the "old generation", clear Eden and start allocating again. The second cycle comes and does the same thing, and so on, until the GC says: "the old generation is full, I can't move anything anymore". This is where the infamous "old generation" collection happens. It's usually costly.

    But we do know about "infant mortality" here. We know that the second and third GC cycles moved some objects to the old generation that would already have been garbage by the fourth cycle. That opportunity was missed. Hence the Survivor space. It keeps objects around for "a little longer" than a single GC cycle (the number of cycles an object has survived is called its survivor age), on the expectation that they will become garbage in the near future. Thus there is no need to scan the old generation often; the GC only scans and takes care of a smaller portion of the heap (Eden and Survivor). As to why there are two Survivor spaces, that's a separate question...
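
    The effect can be shown with a toy simulation. This is only a sketch under made-up assumptions, not how HotSpot is implemented: every cycle allocates 100 objects, of which 90 are dead by the first collection, 8 survive two collections and are dead by the third, and 2 live forever. The class SurvivorSketch, the method promoted and the parameter tenuringThreshold are hypothetical names; a threshold of 0 models "no survivor space", where anything alive at a minor collection goes straight to old.

```java
import java.util.ArrayList;
import java.util.List;

public class SurvivorSketch {

    /** Toy object: it is garbage at any collection numbered >= deathCycle. */
    static final class Obj {
        final int deathCycle;
        int age = 0; // number of minor collections survived so far
        Obj(int deathCycle) { this.deathCycle = deathCycle; }
    }

    /** Runs the given number of minor collections; returns how many objects reached old. */
    static int promoted(int cycles, int tenuringThreshold) {
        List<Obj> survivor = new ArrayList<>();
        int promotedToOld = 0;
        for (int gc = 1; gc <= cycles; gc++) {
            // Allocation phase: fill Eden with the assumed mix of lifetimes.
            List<Obj> eden = new ArrayList<>();
            for (int i = 0; i < 90; i++) eden.add(new Obj(gc));                // dead at this GC
            for (int i = 0; i < 8; i++) eden.add(new Obj(gc + 2));             // dead two GCs later
            for (int i = 0; i < 2; i++) eden.add(new Obj(Integer.MAX_VALUE));  // long-lived

            // Minor collection: only Eden and Survivor are examined, never old.
            List<Obj> candidates = new ArrayList<>(survivor);
            candidates.addAll(eden);
            survivor = new ArrayList<>();
            for (Obj o : candidates) {
                if (gc >= o.deathCycle) continue;  // garbage, simply dropped
                o.age++;
                if (o.age > tenuringThreshold) {
                    promotedToOld++;               // too old for Survivor, move to old
                } else {
                    survivor.add(o);               // keep it in young a little longer
                }
            }
        }
        return promotedToOld;
    }

    public static void main(String[] args) {
        System.out.println("no survivor space: " + promoted(10, 0));
        System.out.println("survivor, threshold 2: " + promoted(10, 2));
    }
}
```

    With these made-up numbers, 10 cycles promote 100 objects without a Survivor space and only 16 with it, because the medium-lived objects die in Survivor instead of filling the old generation. On HotSpot the real knobs for this behaviour include -XX:MaxTenuringThreshold and -XX:SurvivorRatio.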

    In reality, some of the latest GCs don't need that. They found a way to scan the heap concurrently, while your application is running, so they don't have these spaces. The premise that young objects die early still holds, and some GC algorithms may make use of it, now or in the future.