Tags: java, garbage-collection, jvm

How does the GC find GC roots and other object references?


According to the article "How Garbage Collection Works", there are four kinds of GC roots:

  • Local variables are kept alive by the stack of a thread. This is not a real object virtual reference and thus is not visible. For all intents and purposes, local variables are GC roots.
  • Active Java threads are always considered live objects and are therefore GC roots. This is especially important for thread local variables.
  • Static variables are referenced by their classes. This fact makes them de facto GC roots. Classes themselves can be garbage-collected, which would remove all referenced static variables. This is of special importance when we use application servers, OSGi containers or class loaders in general. We will discuss the related problems in the Problem Patterns section.
  • JNI References are Java objects that the native code has created as part of a JNI call. Objects thus created are treated specially because the JVM does not know if it is being referenced by the native code or not. Such objects represent a very special form of GC root, which we will examine in more detail in the Problem Patterns section below.

In the JVM specification, local variables in a stack frame have no type; they are essentially an array of untyped slots, and it is the compiler's responsibility to generate type-specific instructions for them, for instance iload, fload, aload, etc. So it seems the GC cannot find references to objects just by looking at the local variable section of the stack frames.

My questions are:

  1. How does the GC find those roots at all?

  2. How can the GC tell which local variables on the stack are references to objects and not variables of another type (for instance, values that were stored via iconst)?

  3. Then how does the GC find the fields of those objects to build a tree of accessible objects?

  4. Does it use instructions that are defined by the JVM itself to find those objects?

  5. And lastly, what is the meaning of this sentence in the article?

This is not a real object virtual reference and thus is not visible


Solution

    1. How does the GC find those roots at all?

    A JVM provides an internal (C / C++) API for finding the roots. The JVM knows where the Java stack frames are, where the static frames for each Java class are, where the live JNI object handles are, and so on. (It knows because it was involved in creating them, and keeps track of them.)

    2. How can the GC tell which local variables on the stack are references to objects and not variables of another type?

    The JVM keeps information for each method that says which cells in each stack frame are reference variables. The GC can figure out which method each stack frame corresponds to ... just like fillInStackTrace can.

    (for instance constant)

    That's not actually relevant. Constants (i.e. final fields) don't get special treatment by the GC.
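The per-method information described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `MethodInfo`, `Frame`, `referenceSlots` and `findRoots` are hypothetical names invented for illustration, addresses are modelled as plain `long` values, and none of this mirrors the actual data structures of HotSpot or any other JVM. It only shows why untyped slots are not a problem once each frame can be mapped back to its method.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy model of frame scanning; these are not a real JVM's data structures. */
final class FrameRootScanDemo {

    /** Hypothetical per-method metadata: which local-variable slots hold references. */
    static final class MethodInfo {
        final String name;
        final BitSet referenceSlots;

        MethodInfo(String name, BitSet referenceSlots) {
            this.name = name;
            this.referenceSlots = referenceSlots;
        }
    }

    /** Hypothetical stack frame: untyped raw slots plus a link to the owning method. */
    static final class Frame {
        final MethodInfo method;
        final long[] slots;

        Frame(MethodInfo method, long[] slots) {
            this.method = method;
            this.slots = slots;
        }
    }

    /** Builds a BitSet with the given slot indexes set. */
    static BitSet bits(int... indexes) {
        BitSet set = new BitSet();
        for (int index : indexes) {
            set.set(index);
        }
        return set;
    }

    /** Returns the non-null values found in the reference slots of every frame. */
    static List<Long> findRoots(List<Frame> stack) {
        List<Long> roots = new ArrayList<>();
        for (Frame frame : stack) {
            BitSet refs = frame.method.referenceSlots;
            for (int slot = refs.nextSetBit(0); slot >= 0; slot = refs.nextSetBit(slot + 1)) {
                long address = frame.slots[slot];
                if (address != 0L) { // 0 stands in for null in this model
                    roots.add(address);
                }
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        MethodInfo caller = new MethodInfo("caller", bits(0));
        MethodInfo callee = new MethodInfo("callee", bits(1, 2));
        List<Frame> stack = List.of(
                // Slot 1 holds the int 42, so it is ignored.
                new Frame(caller, new long[] {0x1000L, 42L}),
                // Slot 0 holds an int that merely looks like an address; slot 2 is null.
                new Frame(callee, new long[] {0x1000L, 0x2000L, 0L}));
        System.out.println(findRoots(stack)); // prints [4096, 8192]
    }
}
```

Note that slot 0 of the second frame contains the same bit pattern as a real reference, yet it is not reported as a root, because the method's metadata says that slot is not a reference.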

    3. Then how does the GC find the fields of those objects to build a tree of accessible objects?

    The JVM keeps information for each class to say which of the static and instance fields are reference variables. There is a field in each object's header that refers to the class.

    The whole process is called "marking", and it is described in the page you were looking at.
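As a rough illustration of marking driven by per-class field information, here is a minimal sketch. Everything in it is an assumption made for the example: `ClassInfo`, `HeapObject`, `referenceFieldIndexes` and `mark` are hypothetical names, objects are modelled as ordinary Java objects with an `Object[]` of field slots, and a real collector works on raw memory with far more elaborate bookkeeping.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy mark phase; a sketch of the idea, not how any real JVM implements it. */
final class MarkPhaseDemo {

    /** Hypothetical class metadata: indexes of the fields that hold references. */
    static final class ClassInfo {
        final String name;
        final int[] referenceFieldIndexes;

        ClassInfo(String name, int... referenceFieldIndexes) {
            this.name = name;
            this.referenceFieldIndexes = referenceFieldIndexes;
        }
    }

    /** Hypothetical heap object: a header that refers to the class, plus field slots. */
    static final class HeapObject {
        final ClassInfo klass;  // plays the role of the class pointer in the object header
        final Object[] fields;  // a HeapObject or null for references, boxed values otherwise

        HeapObject(ClassInfo klass, Object... fields) {
            this.klass = klass;
            this.fields = fields;
        }
    }

    /** Returns every object reachable from the roots, following only reference fields. */
    static Set<HeapObject> mark(List<HeapObject> roots) {
        Set<HeapObject> marked = new HashSet<>(); // identity-based: HeapObject keeps Object.equals
        Deque<HeapObject> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            HeapObject current = worklist.pop();
            if (!marked.add(current)) {
                continue; // already visited, which also makes cycles terminate
            }
            for (int index : current.klass.referenceFieldIndexes) {
                Object value = current.fields[index];
                if (value != null) {
                    worklist.push((HeapObject) value);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        // Node { int value; Node next; } so only field 1 is a reference.
        ClassInfo node = new ClassInfo("Node", 1);
        HeapObject c = new HeapObject(node, 3, null);
        HeapObject b = new HeapObject(node, 2, c);
        HeapObject a = new HeapObject(node, 1, b);
        HeapObject d = new HeapObject(node, 4, a); // points at a, but nothing points at d
        c.fields[1] = a;                           // cycle: a -> b -> c -> a

        Set<HeapObject> marked = mark(List.of(a));
        System.out.println(marked.size());      // prints 3
        System.out.println(marked.contains(d)); // prints false
    }
}
```

Objects left unmarked at the end (such as `d` here) are the ones a collector would be free to reclaim.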

    4. Does it use instructions that are defined by the JVM itself to find those objects?

    I'm not sure what you are asking. But "probably yes". The GC is a component of the JVM, so everything it does is "defined by the JVM".

    5. And lastly, what is the meaning of this sentence in the article?

    This is not a real object virtual reference and thus is not visible

    It might be saying that the thread's stack is not a Java object ... which is true. But I think you would need to ask the authors of that Ebook; see the bottom of https://www.dynatrace.com/resources/ebooks/javabook/ for their names.


    You added this:

    In the JVM specification, local variables in a stack frame have no type; they are essentially an array of untyped slots, and it is the compiler's responsibility to generate type-specific instructions for them, for instance iload, fload, aload, etc. So it seems the GC cannot find references to objects just by looking at the local variable section of the stack frames.

    Actually, that is not true. As @Holder reminded me, the verifier infers the types of the cells in the stack frame by simulating the effects of the bytecodes that initialize them.

    Later on, the GC can obtain the inferred type information from the JVM.
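The idea of inferring slot types by simulating bytecodes can be sketched in a few lines. This is a deliberately tiny model and not the real verifier: it handles only straight-line code, only a handful of opcodes (named after the real JVM instructions iconst, aconst_null, iload, aload, istore and astore), and only two types. `SlotType`, `Insn` and `inferLocals` are hypothetical names for this example; a real verifier also has to deal with branches, state merging, wide types, exception handlers and much more.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy abstract interpreter over a few JVM-like opcodes; straight-line code only. */
final class SlotTypeInferenceDemo {

    enum SlotType { UNSET, INT, REFERENCE }

    /** Hypothetical instruction: an opcode name plus a local index (unused by constants). */
    static final class Insn {
        final String opcode;
        final int index;

        Insn(String opcode, int index) {
            this.opcode = opcode;
            this.index = index;
        }

        Insn(String opcode) {
            this(opcode, -1);
        }
    }

    /** Simulates the code on types instead of values and returns the final local types. */
    static SlotType[] inferLocals(SlotType[] initialLocals, List<Insn> code) {
        SlotType[] locals = initialLocals.clone();
        Deque<SlotType> operandStack = new ArrayDeque<>();
        for (Insn insn : code) {
            switch (insn.opcode) {
                case "iconst":
                    operandStack.push(SlotType.INT);
                    break;
                case "aconst_null":
                    operandStack.push(SlotType.REFERENCE);
                    break;
                case "iload":
                    require(locals[insn.index] == SlotType.INT, insn);
                    operandStack.push(SlotType.INT);
                    break;
                case "aload":
                    require(locals[insn.index] == SlotType.REFERENCE, insn);
                    operandStack.push(SlotType.REFERENCE);
                    break;
                case "istore":
                    require(operandStack.pop() == SlotType.INT, insn);
                    locals[insn.index] = SlotType.INT;
                    break;
                case "astore":
                    require(operandStack.pop() == SlotType.REFERENCE, insn);
                    locals[insn.index] = SlotType.REFERENCE;
                    break;
                default:
                    throw new IllegalArgumentException("unsupported opcode: " + insn.opcode);
            }
        }
        return locals;
    }

    static void require(boolean ok, Insn insn) {
        if (!ok) {
            throw new IllegalStateException("type error at " + insn.opcode + " " + insn.index);
        }
    }

    public static void main(String[] args) {
        // Models: static void demo(Object o) { int i = 5; Object p = o; }
        SlotType[] atEntry = {SlotType.REFERENCE, SlotType.UNSET, SlotType.UNSET};
        List<Insn> code = List.of(
                new Insn("iconst"), new Insn("istore", 1),
                new Insn("aload", 0), new Insn("astore", 2));
        System.out.println(Arrays.toString(inferLocals(atEntry, code)));
        // prints [REFERENCE, INT, REFERENCE]
    }
}
```

The typed instructions that the question mentions are exactly what makes this inference possible: an astore can only ever put a reference into a slot, and an istore can only ever put an int there.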

    (In theory the GC could also make use of the StackMapTable information to determine when local variables go out of scope within a method. But apparently it doesn't in HotSpot JVMs; see "Does the StackMapTable affect the garbage collection behavior?")


    The description of garbage collection in that Ebook is (deliberately) brief and high level. But that is true of most descriptions that you will find. The deep details are complicated.

    If you really want (and need) to understand how GCs work, my advice is:

    • To find out how the current Java implementations work, read the OpenJDK source code.
    • Track down and read the Sun and Oracle research papers on Java GCs.
    • Get hold of a copy of a good textbook on Garbage Collection.