Is that a correct depiction of the process of creating a new object in Java?


I'm new to Java and OOP in general. After numerous tries to figure out the process of creating a new object, I still have doubts about whether I understand correctly what exactly is happening inside (like 'What is the role of operator new?', 'Who calls the constructor?', 'How does the constructor know what object to initialize?' 'Is this present at all stages or not?', etc.).

Suppose we have this code:

class NewObject {
    private int varA;
    private int varB;

    public NewObject(int a, int b) {
        varA = a;
        varB = b;
    }
}

public class Test {
    public static void main(String args[]) {
        NewObject obj = new NewObject(3, 4);
    }
}

Here is what I think happens behind the scenes:

  1. A new reference variable obj of type NewObject is declared.
  2. Operator new asks Java for some heap memory to allocate an object, using the declaration of class NewObject as a blueprint.
  3. Operator new stores the address of the memory block provided by Java in the variable obj.
  4. Operator new then calls a constructor inside that newly created object and passes it two numerical values (3 and 4, explicitly), as well as the address of this object (this, implicitly) as arguments.
  5. Inside the object, the constructor creates two local variables a and b and assigns them the values received from new. The constructor also implicitly creates a local this to store the address of the object.
  6. The constructor sees varA and varB inside its body but doesn't see an explicit this attached to them, so it first treats them as local variables. Since it can't find corresponding declarations of these local variables, it concludes that they must be instance variables.
  7. The constructor thus searches for the implicit this, and when it finds it, it uses its value as a reference (address) to the object whose instance variables must be initialized with the values from its local variables.

Is this correct, or am I missing something? Thank you!


Solution

  • Let's take a look at the disassembled bytecode of your main method, in the form that javap -c -v prints it (I left out some less relevant pieces):

      public static void main(java.lang.String[]);
        Code:
          stack=4, locals=2, args_size=1
             0: new           #2                  // class NewObject
             3: dup
             4: iconst_3
             5: iconst_4
             6: invokespecial #3                  // Method NewObject."<init>":(II)V
             9: astore_1
            10: return
    
    • #0 (new) says "allocate a new object of the type referenced by constant pool entry #2" (which the comment next to it tells us is the class NewObject). The number at the start of each bullet is the bytecode offset from the listing.
    • #3 (dup) says "duplicate the top value on the stack" (which is the reference to the newly allocated object). Now the stack contains two references to the new object.
    • #4 and #5 push the numbers 3 and 4 onto the stack.
    • #6 invokes the constructor (the method named <init>, constant pool entry #3) using invokespecial. It pops its arguments from the stack: the top three values are a reference to the new object and the numbers 3 and 4. That reference becomes this inside the constructor.
    • #9 stores the remaining value on the stack (the other reference to the new object) in local variable 1, which is obj.
    • #10 returns from the main method to whatever called it.
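One consequence of this ordering (the store at #9 comes after the constructor call at #6) can be observed from plain Java. The following is only a sketch; the class Picky and the method variableStaysNull are hypothetical names made up for this illustration. If the constructor throws, the store is never reached and the variable keeps its old value:

```java
public class AssignOrderDemo {
    // Hypothetical class whose constructor can be made to fail.
    static class Picky {
        Picky(boolean fail) {
            if (fail) {
                throw new IllegalStateException("constructor failed");
            }
        }
    }

    // Returns true if the variable was left untouched by the failed creation.
    static boolean variableStaysNull() {
        Picky obj = null;
        try {
            // The object is allocated, the constructor throws,
            // and the store into obj never executes.
            obj = new Picky(true);
        } catch (IllegalStateException e) {
            // Expected: the constructor refused to finish.
        }
        return obj == null;
    }

    public static void main(String[] args) {
        System.out.println(variableStaysNull()); // prints "true"
    }
}
```

So the reference is stored in obj only after the constructor has returned normally, not before it is called.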

    So the new bytecode only ensures that memory for the object is allocated (with every field set to its default value); it is up to the bytecode after it to actually call the constructor.
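A small sketch can make the split between allocation and initialization visible. The extra field before and the method after are not part of the original class; they are added here only to record what the fields contain at the moment the constructor body starts running:

```java
public class AllocationDemo {
    static class NewObject {
        private int varA;
        private int varB;
        // Added for illustration: snapshot of the fields before they are assigned.
        final String before;

        NewObject(int a, int b) {
            // The object already exists here; its int fields hold the default value 0.
            before = varA + "," + varB;
            varA = a;
            varB = b;
        }

        // Added for illustration: the fields after the constructor has run.
        String after() {
            return varA + "," + varB;
        }
    }

    public static void main(String[] args) {
        NewObject obj = new NewObject(3, 4);
        System.out.println(obj.before);  // prints "0,0"
        System.out.println(obj.after()); // prints "3,4"
    }
}
```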

    Note that this might suggest that you could, in theory, create uninitialized objects and pass them around. However, the Java runtime runs a step called "verification" on the bytecode that it loads to ensure that this can never happen (i.e. you can't just call new and return the value; the runtime will refuse to load a class that tries to do that).

    Also note that the store at #9 is basically pointless, as we write to a local variable that is never read. This indicates that javac is not an optimizing compiler: it translates the Java source code quite directly and does not attempt to optimize it. Optimizations like dropping that store operation usually happen at runtime, in the JIT compiler.

    If we look at the bytecode of the NewObject constructor, we see this (some pieces removed):

      public NewObject(int, int);
        Code:
          stack=2, locals=3, args_size=3
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: aload_0
             5: iload_1
             6: putfield      #2                  // Field varA:I
             9: aload_0
            10: iload_2
            11: putfield      #3                  // Field varB:I
            14: return
    

    Note that args_size=3 tells us that the constructor receives 3 arguments (this and the 2 real arguments), which arrive in local variables 0, 1 and 2. This means that at this level the this reference is treated just like any other parameter.

    • Lines #0 and #1 load this onto the stack and call the superclass constructor (the one of Object).
    • Lines #4 and #5 load this and the first argument a, and line #6 sets the field referenced by #2 (varA of the object referenced by this) to the value of a.
    • Lines #9 to #11 do the same for varB and b.
    • Line #14 returns from the constructor.
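To tie this back to the source code: the compiler decides at compile time that varA refers to a field, so writing varA = a and this.varA = a produce the same aload_0 / iload_1 / putfield sequence. The sketch below spells this out; the static method init and its parameter self are hypothetical, added only to mimic what "this as parameter 0" looks like when written explicitly:

```java
public class ThisDemo {
    static class NewObject {
        int varA;
        int varB;

        NewObject(int a, int b) {
            this.varA = a; // explicit form of "varA = a"
            this.varB = b; // explicit form of "varB = b"
        }
    }

    // Hypothetical static analogue of the constructor body:
    // the receiver is passed explicitly as the first parameter.
    static void init(NewObject self, int a, int b) {
        self.varA = a;
        self.varB = b;
    }

    public static void main(String[] args) {
        NewObject obj = new NewObject(3, 4);
        System.out.println(obj.varA + " " + obj.varB); // prints "3 4"

        init(obj, 5, 6);
        System.out.println(obj.varA + " " + obj.varB); // prints "5 6"
    }
}
```

Nothing is searched for at runtime: the field references and the use of this are already fixed in the bytecode by the time the constructor runs.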