java jvm initialization native system

Initialization of static final Fields in Java's System Class in JDK 14 and Beyond


While examining the initialization of the System class in JDK 14 and later, it's evident that the standard input/output streams are set up by invoking the registerNatives() method, which is called from a static initialization block and is itself native. However, what catches the eye is that the variables out, in, and err are declared static and final. Being final, these fields must be assigned exactly once, during class initialization. In the source, they are explicitly set to null.

Considering that their initialization might be handled by a native method, the explicit assignment to null doesn't simply vanish from the code, and logically, this part of the code is expected to execute as well. Furthermore, if we assume that this whole process takes place behind the scenes in the <clinit> method, the native method would complete first, followed by the field initializer that assigns null (for example, looking at the out field). How is null effectively overridden, and where does this actual initialization occur?

Moreover, if we delve into the comments, it becomes apparent that the initialization of this class is somehow separated from the regular execution of <clinit>:

    /* Register the natives via the static initializer.
     *
     * The VM will invoke the initPhase1 method to complete the initialization
     * of this class separate from <clinit>.
     */
    private static native void registerNatives();
    
    static {
        registerNatives();
    }

What is initPhase1? What happens in this phase, and what exactly does it entail? I'm incredibly curious! Thanks to everyone in advance!


Solution

  • While examining the initialization of the System class in JDK 14 and later, it's evident that the standard input/output streams are set up by invoking the registerNatives() method, [...]

    No, these are not initialized from registerNatives(). They are initialized in the initPhase1() method, which is a regular Java method (in the System class) that is called from native code after the System class has been initialized (i.e. after those fields have been set to null).

    The call to initPhase1() is made in a C++ method named initialize_java_lang_classes.


    The initialize_java_lang_classes method initializes the System class (among other classes). This initialization executes all static code blocks and all static initializers, i.e. the whole of <clinit>, including the assignments of null to in, out and err.

    Only later does it call initPhase1(), so that method runs after class initialization has completed and overwrites the null values.
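
    The ordering described above can be imitated in plain Java. This is only a sketch: MiniSystem, boot() and the log messages are made-up names, the field is deliberately not final (plain Java cannot reassign a static final field), and boot() merely plays the role that the VM's C++ code plays for the real System class.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseInitSketch {
    // Records the order in which the individual steps run.
    static final List<String> LOG = new ArrayList<>();

    // Hypothetical stand-in for java.lang.System.
    static class MiniSystem {
        static {
            // In the real class, registerNatives() is called here.
            LOG.add("static block");
        }

        // Not final in this sketch; the real field is "static final ... out = null".
        static String out = logAndReturnNull();

        private static String logAndReturnNull() {
            LOG.add("field initializer: out = null");
            return null;
        }

        // Stand-in for System.initPhase1(), called after <clinit> has finished.
        static void initPhase1() {
            out = "stream";
            LOG.add("initPhase1: out assigned");
        }
    }

    // Plays the role of the VM: initialize the class first, call initPhase1 later.
    static List<String> boot() {
        // Reading a non-constant static field triggers class initialization.
        String afterClinit = MiniSystem.out;
        LOG.add("after <clinit>: out = " + afterClinit);
        MiniSystem.initPhase1();
        LOG.add("after initPhase1: out = " + MiniSystem.out);
        return LOG;
    }

    public static void main(String[] args) {
        boot().forEach(System.out::println);
    }
}
```

    The static block and the field initializer run in textual order as part of class initialization, so the null assignment is not skipped. It is simply overwritten afterwards by the separate second phase.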


    Java code is not allowed to reassign final fields, but native code can still change them (see the source for setIn0() on how this is done). The corresponding user-accessible methods (setIn(), setOut() and setErr()) have existed since Java 1.1. I can only assume that in Java 1.0 you could not change System.in and System.out.
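
    You can observe this from ordinary Java code. The following is a small sketch (the class and method names SetOutDemo, outIsFinal and capture are mine): it checks via reflection that System.out is declared final and then replaces it anyway through the public System.setOut() method.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.lang.reflect.Modifier;

public class SetOutDemo {
    // Returns true if the field System.out is declared final.
    static boolean outIsFinal() {
        try {
            return Modifier.isFinal(System.class.getField("out").getModifiers());
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Temporarily redirects System.out and returns whatever was printed to it.
    static String capture(String message) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            System.setOut(new PrintStream(buffer, true));
            System.out.print(message);
            System.out.flush();
        } finally {
            // Restore the original stream so later output is not swallowed.
            System.setOut(original);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println("System.out is final: " + outIsFinal());
        System.out.println("captured: " + capture("hello"));
    }
}
```

    The redirected text ends up in the buffer even though the field is final, because setOut() delegates to the native setOut0().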


    What does registerNatives() do?

    If you look at the source for registerNatives(), you will see that it registers some of the native methods of the System class with the JVM. According to the comment just above the method table, only the performance-critical methods are registered this way.

    I've already linked the source of registerNatives() above, but I include it here for completeness. Clearly it doesn't do anything like calling initPhase1():

    /* Only register the performance-critical methods */
    static JNINativeMethod methods[] = {
        {"currentTimeMillis", "()J",              (void *)&JVM_CurrentTimeMillis},
        {"nanoTime",          "()J",              (void *)&JVM_NanoTime},
        {"arraycopy",     "(" OBJ "I" OBJ "II)V", (void *)&JVM_ArrayCopy},
    };
    
    #undef OBJ
    
    JNIEXPORT void JNICALL
    Java_java_lang_System_registerNatives(JNIEnv *env, jclass cls)
    {
        (*env)->RegisterNatives(env, cls,
                                methods, sizeof(methods)/sizeof(methods[0]));
    }
    

    The code block from the System.java source:

        /* Register the natives via the static initializer.
         *
         * The VM will invoke the initPhase1 method to complete the initialization
         * of this class separate from <clinit>.
         */
        private static native void registerNatives();
        static {
            registerNatives();
        }
    

    seems to have misled you: you are thinking that initPhase1() is somehow called from registerNatives(). But as the source code for registerNatives() shows, this is not the case.

    Also, registerNatives() is called from a static initializer block, which means it is called from <clinit>. If registerNatives() really did call initPhase1(), then initPhase1() would be called indirectly from <clinit>, which contradicts the comment: The VM will invoke the initPhase1 method [...] separate from <clinit>


    As for the code of the native setIn0() method (also already linked above), this is also found in System.c:

    /*
     * The following three functions implement setter methods for
     * java.lang.System.{in, out, err}. They are natively implemented
     * because they violate the semantics of the language (i.e. set final
     * variable).
     */
    JNIEXPORT void JNICALL
    Java_java_lang_System_setIn0(JNIEnv *env, jclass cla, jobject stream)
    {
        jfieldID fid =
            (*env)->GetStaticFieldID(env,cla,"in","Ljava/io/InputStream;");
        if (fid == 0)
            return;
        (*env)->SetStaticObjectField(env,cla,fid,stream);
    }
    

    (I did not bother to copy the code for setOut0 and setErr0 since their code is the same except for the names of the fields.)

    Note the comment above the native functions: it clearly states their purpose and that these functions do something the Java language does not allow.
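
    For comparison, here is a sketch of what happens when plain Java code tries to do the same thing via reflection. Holder and tryReflectiveWrite are made-up names. Holder.VALUE mirrors the declaration style of System.out. As far as I know, Field.set() refuses to write a static final field even after setAccessible(true) and throws an IllegalAccessException, which is why the JDK resorts to JNI for these three fields.

```java
import java.lang.reflect.Field;

public class FinalWriteAttempt {
    // Hypothetical holder mirroring "public static final PrintStream out = null;".
    static class Holder {
        static final Object VALUE = null;
    }

    // Tries to overwrite Holder.VALUE reflectively; returns true only if the write succeeded.
    static boolean tryReflectiveWrite() {
        try {
            Field field = Holder.class.getDeclaredField("VALUE");
            field.setAccessible(true);
            field.set(null, "replaced");
            return true;
        } catch (IllegalAccessException e) {
            // Expected path: static final fields are read-only for reflection.
            return false;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reflective write succeeded: " + tryReflectiveWrite());
        System.out.println("Holder.VALUE = " + Holder.VALUE);
    }
}
```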