What are JVM's rules for variables allocations on stack?


A Long object on a 64-bit machine will likely take 24 bytes: 12 for the object header, 8 for the long value itself, and another 4 bytes of padding. I easily found this by googling, but I couldn't find it in the JVM Specification.

What I couldn't find anywhere is how much space variables of primitive types take when they are put on the stack. Are they padded at all? What are the rules? For example, what would be the stack memory footprint of the variables in the following method:

class MemoryTest{
  static void foo() {
    int anInt=0;
    long aLong=0L;
    byte aByte=0;
    short aShort=0;
    // some code that uses the vars above
  }
}

Solution

  • I easily googled this, but I couldn't find it in JVM Specification.

    Well, that is because it is an implementation detail. Anything the spec doesn't specify is left to the implementation. In this case, the spec calls this out explicitly in 2.7:

    The Java Virtual Machine does not mandate any particular internal structure for objects.

    The JVM Data Types

    The data types of the JVM are defined in 2.3.

    The integral types are:

    • byte, whose values are 8-bit signed two's-complement integers, and whose default value is zero

    • short, whose values are 16-bit signed two's-complement integers, and whose default value is zero

    • int, whose values are 32-bit signed two's-complement integers, and whose default value is zero

    • long, whose values are 64-bit signed two's-complement integers, and whose default value is zero

    • char, whose values are 16-bit unsigned integers representing Unicode code points in the Basic Multilingual Plane, encoded with UTF-16, and whose default value is the null code point ('\u0000')

    The floating-point types are:

    • float, whose values exactly correspond to the values representable in the 32-bit IEEE 754 binary32 format, and whose default value is positive zero

    • double, whose values exactly correspond to the values of the 64-bit IEEE 754 binary64 format, and whose default value is positive zero

    They just have the sizes that you expect, with no headers like the wrapper classes have. However, the JVM instruction set is limited, and not every kind of instruction exists for every type. In fact, most of the instructions are for int, long, float, and double (see the type support table in 2.11.1 of the spec for more info), so in the end you'd almost always be working with those types anyway.
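As a quick check of the widths listed above, here is a minimal sketch that reads the SIZE constants from the standard wrapper classes. The class name PrimitiveWidths and the helper widthInBits are made up for this illustration. Note that these are the widths of the values; they say nothing about how much room a given JVM reserves for a variable in a frame. The last lines show the "you end up working with int anyway" point at the language level.

```java
class PrimitiveWidths {
    // Width in bits of a value of the named primitive type,
    // taken from the SIZE constants on the wrapper classes.
    static int widthInBits(String type) {
        switch (type) {
            case "byte":   return Byte.SIZE;
            case "short":  return Short.SIZE;
            case "char":   return Character.SIZE;
            case "int":    return Integer.SIZE;
            case "long":   return Long.SIZE;
            case "float":  return Float.SIZE;
            case "double": return Double.SIZE;
            default:
                throw new IllegalArgumentException("not a numeric primitive: " + type);
        }
    }

    public static void main(String[] args) {
        String[] types = {"byte", "short", "char", "int", "long", "float", "double"};
        for (String t : types) {
            System.out.println(t + ": " + widthInBits(t) + " bits");
        }

        byte a = 1, b = 2;
        int sum = a + b;      // both operands are promoted to int before the addition
        // byte bad = a + b;  // would not compile: the result type is int
        System.out.println("a + b = " + sum);
    }
}
```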

    The Operand Stack and the Local Variables

    Note that values of primitive types are not directly "put on the stack". The JVM stack stores frames, which are "used to store data and partial results, as well as to perform dynamic linking, return values for methods, and dispatch exceptions."

    On the frame, there is an "operand stack", where partial results are stored. For example, an implementation might push the 0s in your code onto the operand stack first, and then pop them into the local variables.

    The spec says the following about the sizes of the things on the operand stack:

    Each entry on the operand stack can hold a value of any Java Virtual Machine type, including a value of type long or type double.

    [...]

    At any point in time, an operand stack has an associated depth, where a value of type long or double contributes two units to the depth and a value of any other type contributes one unit.
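To make the depth rule concrete, here is a small sketch that tracks operand stack depth in the spec's units: 2 for long and double, 1 for everything else. It is only a toy model, not how any JVM is implemented. The class OperandDepth and the "+I" / "-J" step notation (push or pop a value with that type descriptor) are invented for this example.

```java
import java.util.List;

class OperandDepth {
    // Depth units contributed by one value of the given type descriptor:
    // 'J' (long) and 'D' (double) count as two units, anything else as one.
    static int units(char descriptor) {
        return (descriptor == 'J' || descriptor == 'D') ? 2 : 1;
    }

    // Each step is "+X" (push a value of type X) or "-X" (pop one).
    // Returns the largest depth reached while running the steps in order.
    static int maxDepth(List<String> steps) {
        int depth = 0;
        int max = 0;
        for (String step : steps) {
            int u = units(step.charAt(1));
            depth += (step.charAt(0) == '+') ? u : -u;
            if (depth < 0) {
                throw new IllegalStateException("stack underflow at " + step);
            }
            max = Math.max(max, depth);
        }
        return max;
    }

    public static void main(String[] args) {
        // push an int, pop it, push a long, pop it
        System.out.println(maxDepth(List.of("+I", "-I", "+J", "-J"))); // prints 2
    }
}
```

The long is what drives the maximum to 2 here; with only int, byte, or short values pushed and popped one at a time, the maximum would be 1.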

    According to the spec, the local variables are also stored in the frame, in an array. The parts relevant to what you are asking are:

    A single local variable can hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress. A pair of local variables can hold a value of type long or double.

    [...]

    A value of type long or type double occupies two consecutive local variables. Such a value may only be addressed using the lesser index. For example, a value of type double stored in the local variable array at index n actually occupies the local variables with indices n and n+1; however, the local variable at index n+1 cannot be loaded from. It can be stored into. However, doing so invalidates the contents of local variable n.

    The Java Virtual Machine does not require n to be even. In intuitive terms, values of types long and double need not be 64-bit aligned in the local variables array. Implementors are free to decide the appropriate way to represent such values using the two local variables reserved for the value.
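The slot rules quoted above can be sketched as a simple index calculation. This assumes the compiler hands out slots sequentially in declaration order and never reuses one, which is a simplification and not something the spec requires. The class LocalSlots and its methods are hypothetical names; firstFree stands for the first index not already taken by this or by parameters.

```java
import java.util.ArrayList;
import java.util.List;

class LocalSlots {
    // Number of local variable slots a value of this type descriptor occupies:
    // two for long ('J') and double ('D'), one for every other type.
    static int width(char descriptor) {
        return (descriptor == 'J' || descriptor == 'D') ? 2 : 1;
    }

    // Index at which each variable starts, given one descriptor character per
    // variable in declaration order. Assumes sequential allocation, no reuse.
    static List<Integer> indices(String descriptors, int firstFree) {
        List<Integer> result = new ArrayList<>();
        int next = firstFree;
        for (char d : descriptors.toCharArray()) {
            result.add(next);
            next += width(d);
        }
        return result;
    }

    // Size the local variable array needs to hold all of the variables.
    static int localsNeeded(String descriptors, int firstFree) {
        int next = firstFree;
        for (char d : descriptors.toCharArray()) {
            next += width(d);
        }
        return next;
    }

    public static void main(String[] args) {
        // int, long, byte, short in a static method with no parameters
        System.out.println(indices("IJBS", 0));      // prints [0, 1, 3, 4]
        System.out.println(localsNeeded("IJBS", 0)); // prints 5
    }
}
```

Note that the long starts at index 1, an odd index, which is allowed because the spec does not require n to be even.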

    Your Example

    As for your code, we need to make some reasonable assumptions. Let's assume that the following class file (as printed by javap -v) is generated. I compiled this with my javac, using javac -g MemoryTest.java.

    Classfile /Users/mulangsu/Desktop/MemoryTest.class
      Last modified 2022/09/24; size 264 bytes
      SHA-256 checksum c1aa63404c5e590ef52116de586a91d75deb8727ab749cbd1e0d6fb4197357c2
      Compiled from "MemoryTest.java"
    class MemoryTest
      minor version: 0
      major version: 61
      flags: (0x0020) ACC_SUPER
      this_class: #7                          // MemoryTest
      super_class: #2                         // java/lang/Object
      interfaces: 0, fields: 0, methods: 2, attributes: 1
    Constant pool:
       #1 = Methodref          #2.#3          // java/lang/Object."<init>":()V
       #2 = Class              #4             // java/lang/Object
       #3 = NameAndType        #5:#6          // "<init>":()V
       #4 = Utf8               java/lang/Object
       #5 = Utf8               <init>
       #6 = Utf8               ()V
       #7 = Class              #8             // MemoryTest
       #8 = Utf8               MemoryTest
       #9 = Utf8               Code
      #10 = Utf8               LineNumberTable
      #11 = Utf8               foo
      #12 = Utf8               SourceFile
      #13 = Utf8               MemoryTest.java
    {
      MemoryTest();
        descriptor: ()V
        flags: (0x0000)
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 1: 0
    
      static void foo();
        descriptor: ()V
        flags: (0x0008) ACC_STATIC
        Code:
          stack=2, locals=5, args_size=0
             0: iconst_0
             1: istore_0
             2: lconst_0
             3: lstore_1
             4: iconst_0
             5: istore_3
             6: iconst_0
             7: istore        4
             9: return
          LineNumberTable:
            line 3: 0
            line 4: 2
            line 5: 4
            line 6: 6
            line 8: 9
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                2       8     0 anInt   I
                4       6     1 aLong   J
                6       4     3 aByte   B
                9       1     4 aShort   S
    }
    SourceFile: "MemoryTest.java"
    

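    Interestingly, all the instructions used are int and long instructions: even aByte and aShort are written with istore, because there are no store instructions specific to byte or short local variables. There is no bipush or sipush either, even though the compiler could have used them to push the zeros. iconst_0 is a single byte, whereas bipush 0 takes two and sipush 0 takes three, so using them would only make the class file slightly bigger.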

    This is the "push the 0s onto the operand stack, then pop them into the local variables" behaviour I described earlier.

    If we also assume that each local variable slot is implemented as 4 bytes, then the local variables would occupy 20 bytes in total: anInt, aByte, and aShort each occupy one slot and aLong occupies two, which makes 5 slots (matching locals=5 in the output above).
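The arithmetic is trivial, but writing it out shows how much the answer depends on the slot-width assumption. In this sketch the class LocalsFootprint and the method bytesFor are made up, and both slot widths passed in are assumptions: the spec does not fix a slot size, so an implementation is free to use something else.

```java
class LocalsFootprint {
    // Bytes taken by the local variable array, given the number of slots
    // (locals=... in javap output) and an ASSUMED number of bytes per slot.
    static long bytesFor(int slots, int bytesPerSlot) {
        if (slots < 0 || bytesPerSlot <= 0) {
            throw new IllegalArgumentException("slots must be >= 0 and bytesPerSlot > 0");
        }
        return (long) slots * bytesPerSlot;
    }

    public static void main(String[] args) {
        int slots = 5; // anInt (1) + aLong (2) + aByte (1) + aShort (1)
        System.out.println(bytesFor(slots, 4)); // prints 20, the 4-byte assumption used above
        System.out.println(bytesFor(slots, 8)); // prints 40, if slots were 8 bytes wide instead
    }
}
```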