Tags: java, jvm, resolution, dynamic-linking, symbolic-references

Where is the resolved reference (that is, the direct memory address that replaces a symbolic reference) stored in the JVM after resolution?


I've been studying the JVM (specifically the JDK 8 version), and while studying class linking I couldn't figure out where the direct memory address determined from a symbolic reference during resolution is stored.

There are several kinds of resolution, such as type (class/interface), field, and method resolution, but I'll use a class example to keep the explanation simple.

The JVM specification says:

5.1 The Run-Time Constant Pool The Java Virtual Machine maintains a per-type constant pool (§2.5.5), a run-time data structure that serves many of the purposes of the symbol table of a conventional programming language implementation. The constant_pool table (§4.4) in the binary representation of a class or interface is used to construct the run-time constant pool upon class or interface creation (§5.3). All references in the run-time constant pool are initially symbolic.

The specification says that all references are symbolic at first.

Here's a sample Main class.

public class Main {
    public static void main(String[] args) {
        Object obj = new Object();
    }
}

Here's the Constant pool info of the Main class.

Constant pool:
#1 = Methodref          #2.#12         // java/lang/Object."<init>":()V
#2 = Class              #13            // java/lang/Object
#3 = Class              #14            // Main
#4 = Utf8               <init>
#5 = Utf8               ()V
#6 = Utf8               Code
#7 = Utf8               LineNumberTable
#8 = Utf8               main
#9 = Utf8               ([Ljava/lang/String;)V
#10 = Utf8               SourceFile
#11 = Utf8               Main.java
#12 = NameAndType        #4:#5          // "<init>":()V
#13 = Utf8               java/lang/Object
#14 = Utf8               Main
{
  public Main();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method     java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=2, args_size=1
         0: new           #2                  // class java/lang/Object
         3: dup
         4: invokespecial #1                  // Method java/lang/Object."<init>":()V
         7: astore_1
         8: return
      LineNumberTable:
        line 3: 0
        line 4: 8
}
SourceFile: "Main.java"

4.4.1 The CONSTANT_Class_info Structure
The CONSTANT_Class_info structure is used to represent a class or an interface:
CONSTANT_Class_info {
   u1 tag;
   u2 name_index;
}

Here, the Object class is referenced in the main method of the Main class. Before that point, Main has never referenced the Object class (the command java Main has only just been executed). The Object class entry in Main's constant pool (here #2, a CONSTANT_Class_info structure) has name_index #13. #13 is a CONSTANT_Utf8_info structure containing the name of the Object class, and I take #13 to be the symbolic reference to the Object class. (Honestly, I'm not sure whether this Utf8 constant pool entry is the symbolic reference for #2, the Class entry for Object.)
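The #2 -> #13 link described above can be checked by reading the constant pool straight out of a class file. The following is a minimal sketch, not part of the original question: the class name ClassEntryDump and the method classEntries are hypothetical, and it assumes the class file layout from JVMS §4.1 and §4.4 (magic, versions, constant_pool_count, then the entries). It only shows what is in the class file before any resolution happens.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

final class ClassEntryDump {

    /** Lists every CONSTANT_Class_info entry together with the Utf8 name it points to. */
    static List<String> classEntries(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort();              // minor_version
        in.readUnsignedShort();              // major_version
        int count = in.readUnsignedShort();  // constant_pool_count; valid indexes are 1..count-1

        String[] utf8 = new String[count];   // text of CONSTANT_Utf8_info entries
        int[] nameIndex = new int[count];    // name_index of CONSTANT_Class_info entries, 0 otherwise

        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1:  utf8[i] = in.readUTF(); break;                  // Utf8: u2 length + bytes
                case 7:  nameIndex[i] = in.readUnsignedShort(); break;   // Class: u2 name_index
                case 5: case 6:                                          // Long, Double: 8 bytes, two slots
                    skip(in, 8); i++; break;
                case 15: skip(in, 3); break;                             // MethodHandle
                case 8: case 16: case 19: case 20:                       // String, MethodType, Module, Package
                    skip(in, 2); break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(in, 4); break;                                  // Integer, Float, *ref, NameAndType, (Invoke)Dynamic
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }

        List<String> lines = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            if (nameIndex[i] != 0) {
                lines.add("#" + i + " = Class #" + nameIndex[i] + " // " + utf8[nameIndex[i]]);
            }
        }
        return lines;
    }

    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the JDK exposes its own .class files as resources.
        try (InputStream in = Object.class.getResourceAsStream("/java/lang/Object.class")) {
            if (in == null) {
                System.out.println("class file not available as a resource");
                return;
            }
            classEntries(in).forEach(System.out::println);
        }
    }
}
```

Run against Main.class instead of Object.class, this should list the same two Class entries that javap shows, each holding nothing more than a u2 index to a Utf8 name.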

When the JVM's execution engine executes a bytecode instruction that references the Object class (in this case, 0: new #2), #2 refers to #13 (the symbolic reference). So it needs to be resolved to the direct address of the Object class in the Method Area of the JVM, and class resolution occurs.

Here's the question. I've read and searched the JVM specification, blogs, and articles, but I couldn't find where the JVM stores the resolved direct memory address for a symbolic reference.

I found some information in a blog, which said:

Binding is the process of the field, method or class identified by the symbolic reference being replaced by a direct reference, this only happens once because the symbolic reference is completely replaced.

It said "replaced". In constant pool entry #2, the symbolic reference to the Object class is stored in the name_index field (type u2) of the CONSTANT_Class_info structure.

Is the value of the name_index field changed to the direct memory address of the Object class (maybe in the run-time constant pool for the Object class in the Method Area)?

If not, where is the direct address stored?

Please give me the answer. Thank you.


Solution

  • The specification does not say where the JVM stores resolved constant pool entries. It is an implementation-specific detail.

    In the HotSpot JVM (JDK 8), the constant pool resides in Metaspace. It consists of two related arrays: an array of tags and an array of values. The tags describe the types of the corresponding values. These are not the same tags as those defined in JVMS §4.4, however: the JVM fills the constant pool with its own tags during the class file parsing stage.

    There are 4 different types of constant pool entry that denote a reference to a Java class:

    • JVM_CONSTANT_ClassIndex initially contains an integer index to a constant pool Utf8 entry with the class name.
    • JVM_CONSTANT_UnresolvedClass. After the initial contents of the constant pool are completely loaded, the JVM changes JVM_CONSTANT_ClassIndex tags to JVM_CONSTANT_UnresolvedClass and replaces the corresponding cp entries with symbolic names.
    • JVM_CONSTANT_UnresolvedClassInError means the same as JVM_CONSTANT_UnresolvedClass, but denotes that the class resolution attempt has failed.
    • JVM_CONSTANT_Class contains a raw address of the internal representation of the resolved class.

    So, your guess was correct: during constant pool resolution the HotSpot JVM modifies cp entries in place and changes the corresponding cp tags. That is, JVM_CONSTANT_UnresolvedClass becomes JVM_CONSTANT_Class, and the symbolic name is replaced with the direct address right in the same slot of the array of constant pool values.

    You can find the implementation in ConstantPool::klass_at_impl.
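
The in-place replacement described above can be illustrated with a toy model in Java. This is only a sketch of the idea, not HotSpot code: ToyConstantPool, its Tag enum, and klassAt are hypothetical names, a Class<?> object stands in for the raw address of the resolved class, and Class.forName stands in for the JVM's real loading and resolution logic.

```java
final class ToyConstantPool {

    // Hypothetical tags, loosely modeled on the HotSpot tags listed above.
    enum Tag { UNRESOLVED_CLASS, UNRESOLVED_CLASS_IN_ERROR, CLASS }

    private final Tag[] tags;       // one tag per constant pool slot
    private final Object[] values;  // String while unresolved, Class<?> once resolved

    ToyConstantPool(int size) {
        tags = new Tag[size];
        values = new Object[size];
    }

    void putUnresolvedClass(int index, String internalName) {
        tags[index] = Tag.UNRESOLVED_CLASS;
        values[index] = internalName;
    }

    Tag tagAt(int index) { return tags[index]; }

    Object valueAt(int index) { return values[index]; }

    /** Resolves the entry on first use and overwrites tag and value in the same slot. */
    Class<?> klassAt(int index) {
        if (tags[index] == Tag.CLASS) {
            return (Class<?>) values[index];        // already resolved: just read the slot
        }
        String name = (String) values[index];
        if (tags[index] == Tag.UNRESOLVED_CLASS_IN_ERROR) {
            throw new NoClassDefFoundError(name);   // an earlier attempt already failed
        }
        try {
            Class<?> resolved = Class.forName(name.replace('/', '.'));
            values[index] = resolved;               // symbolic name replaced by the direct reference
            tags[index] = Tag.CLASS;                // tag flipped to "resolved"
            return resolved;
        } catch (ClassNotFoundException e) {
            tags[index] = Tag.UNRESOLVED_CLASS_IN_ERROR;
            throw new NoClassDefFoundError(name);
        }
    }

    public static void main(String[] args) {
        ToyConstantPool cp = new ToyConstantPool(3);
        cp.putUnresolvedClass(2, "java/lang/Object");
        System.out.println(cp.tagAt(2) + " " + cp.valueAt(2)); // UNRESOLVED_CLASS java/lang/Object
        cp.klassAt(2);                                         // plays the role of executing new #2
        System.out.println(cp.tagAt(2) + " " + cp.valueAt(2)); // CLASS class java.lang.Object
    }
}
```

After the first klassAt(2) call the slot no longer holds the name at all, which is why resolution only has to happen once per entry.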