
How many Strings are getting created with the new operator



Let's say I am creating a string with the new operator:

String str = new String("Cat");

Will it create two strings, one on the heap and another in the string pool?

If it creates a string in the string pool as well, then what is the purpose of the String intern method?


Solution

  • How many objects?

    Will it create two strings, one on the heap and another in the string pool?

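    When you write "Cat", the compiler stores Cat in the constant pool of the class file. At the latest when the literal is first used at run time, the JVM makes sure a matching string object exists in the string pool, and the "Cat" expression evaluates to that pooled object. new String(...) then creates a separate string object, ignoring the pool completely.

    So this snippet leads to the creation of two objects: the pooled Cat (unless it is already in the pool) and the copy made by new. To clear up your confusion, consider the following example: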

    String first = "Cat";
    String second = "Cat";
    String third = "Cat";
    String fourth = new String("Cat");
    

    Here, two objects are created as well. Every "Cat" literal evaluates to the same pooled string, so first, second and third all refer to the same object. fourth is its own object since it was created with new, which always creates a new object and bypasses any sort of caching mechanism.
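
    The identity claims above can be checked directly. This is only a minimal sketch: the class name StringIdentityDemo and the helper sameObject are made up for illustration, and == is used on purpose here because it compares object identity rather than content.

```java
class StringIdentityDemo {
    // Hypothetical helper: true if both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String first = "Cat";
        String second = "Cat";
        String third = "Cat";
        String fourth = new String("Cat");

        System.out.println(sameObject(first, second)); // true, same pooled object
        System.out.println(sameObject(second, third)); // true, same pooled object
        System.out.println(sameObject(first, fourth)); // false, new always creates an object
        System.out.println(first.equals(fourth));      // true, the content is equal
    }
}
```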

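    Where exactly these objects live in memory is an implementation detail of the JVM. Conceptually, Java objects are allocated on the heap (in current HotSpot versions that includes the string pool), but the JVM is free to optimize allocations as long as the program behaves the same.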


    String pool details

    The compiler already does part of the work. When you write "Cat", it puts an entry representing Cat into the constant pool of the class file and replaces the "Cat" in your code with an instruction that loads this constant. At run time, the JVM resolves that entry to a string object from the string pool. You can see this easily when you disassemble a compiled program. For example, source code:

    public class Test {
        public static void main(String[] args) {
            String foo = "Hello World";
        }
    }
    

    disassembly (javap -v):

    Classfile /C:/Users/Zabuza/Desktop/Test.class
      Last modified 30.03.2021; size 277 bytes
      SHA-256 checksum 83de8a7326af14fc95fb499af090f9b3377c56f79f2e78b34e447d66b645a285
      Compiled from "Test.java"
    public class Test
      minor version: 0
      major version: 59
      flags: (0x0021) ACC_PUBLIC, ACC_SUPER
      this_class: #9                          // Test
      super_class: #2                         // java/lang/Object
      interfaces: 0, fields: 0, methods: 2, attributes: 1
    Constant pool:
       #1 = Methodref          #2.#3          // java/lang/Object."<init>":()V
       #2 = Class              #4             // java/lang/Object
       #3 = NameAndType        #5:#6          // "<init>":()V
       #4 = Utf8               java/lang/Object
       #5 = Utf8               <init>
       #6 = Utf8               ()V
       #7 = String             #8             // Hello World
       #8 = Utf8               Hello World
       #9 = Class              #10            // Test
      #10 = Utf8               Test
      #11 = Utf8               Code
      #12 = Utf8               LineNumberTable
      #13 = Utf8               main
      #14 = Utf8               ([Ljava/lang/String;)V
      #15 = Utf8               SourceFile
      #16 = Utf8               Test.java
    {
      public Test();
        descriptor: ()V
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 1: 0
    
      public static void main(java.lang.String[]);
        descriptor: ([Ljava/lang/String;)V
        flags: (0x0009) ACC_PUBLIC, ACC_STATIC
        Code:
          stack=1, locals=2, args_size=1
             0: ldc           #7                  // String Hello World
             2: astore_1
             3: return
          LineNumberTable:
            line 3: 0
            line 4: 3
    }
    SourceFile: "Test.java"
    

    As you can see, the constant pool contains

    #7 = String             #8             // Hello World
    #8 = Utf8               Hello World
    

    and the literal in the method body is replaced by

    0: ldc           #7
    

    which loads Hello World from the string pool.
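
    One consequence is that the same literal in two different classes ends up as the same object, since both ldc instructions resolve to the same pool entry. A small sketch to illustrate this; the class names LiteralHolderA, LiteralHolderB and SharedLiteralDemo are hypothetical:

```java
// Two unrelated classes that each contain the same literal in their own constant pool.
class LiteralHolderA {
    static String value() {
        return "Hello World";
    }
}

class LiteralHolderB {
    static String value() {
        return "Hello World";
    }
}

class SharedLiteralDemo {
    public static void main(String[] args) {
        // Both methods return the pooled string, so the references are identical.
        System.out.println(LiteralHolderA.value() == LiteralHolderB.value()); // true
    }
}
```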


    String interning

    What is the purpose of the String intern method?

    Well, it lets you swap your string for the version from the pool, and it adds your string to the pool in case it did not exist there before. For example:

    String first = "hello"; // from pool
    String second = new String("hello"); // new object
    String third = second.intern(); // from pool, same as first
    
    System.out.println(first == second); // false
    System.out.println(first == third); // true
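
    The same works for strings that are built at run time rather than written as literals. A minimal sketch, assuming a made-up helper build that always produces a fresh string object:

```java
class InternDemo {
    // Hypothetical helper: concatenates at run time, so the result is a new object each call.
    static String build(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    public static void main(String[] args) {
        String a = build("he", "llo");
        String b = build("he", "llo");

        System.out.println(a == b);                   // false, two separate objects
        System.out.println(a.equals(b));              // true, same content
        System.out.println(a.intern() == b.intern()); // true, both map to the pooled string
        System.out.println(a.intern() == "hello");    // true, the literal is the pooled string
    }
}
```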
    

    Use case

    I have not seen a real-world application for this feature yet, though.

    However, I could think of a use case where you create possibly long strings dynamically in your application and you know that they will occur again later. Then you can put your strings into the pool in order to trim down the memory footprint when they show up again.

    So let's say you receive some long strings from HTTP responses, you know that the responses are exactly the same most of the time, and you also want to collect them in a List:

    private List<String> responses = new ArrayList<>();
    
    ...
    
    public void receiveResponse(String response) {
        ...
        responses.add(response);
    }
    

    Without interning, you would end up keeping each string instance, including duplicates, alive in memory. If you intern them, however, the list only references one string object per distinct response, and the duplicates can be garbage collected:

    public void receiveResponse(String response) {
        ...
        String responseFromPool = response.intern();
        responses.add(responseFromPool);
    }
    

    Of course, this is a bit contrived, since you could also just use a Set here if you do not need to keep the duplicate entries.
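
    To see the effect, here is a runnable sketch of the idea. Everything in it is an assumption made for illustration: the class ResponseCollector, the intern flag, the distinctInstances helper, and the new String(...) calls that stand in for freshly received response bodies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ResponseCollector {
    private final List<String> responses = new ArrayList<>();
    private final boolean intern;

    ResponseCollector(boolean intern) {
        this.intern = intern;
    }

    void receiveResponse(String response) {
        // With interning, equal responses are replaced by the one pooled object.
        responses.add(intern ? response.intern() : response);
    }

    // Counts distinct String objects by identity (==), not by equals.
    int distinctInstances() {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        identities.addAll(responses);
        return identities.size();
    }

    public static void main(String[] args) {
        ResponseCollector plain = new ResponseCollector(false);
        ResponseCollector interned = new ResponseCollector(true);

        for (int i = 0; i < 3; i++) {
            // new String(...) simulates a freshly created response body.
            plain.receiveResponse(new String("{\"status\":\"ok\"}"));
            interned.receiveResponse(new String("{\"status\":\"ok\"}"));
        }

        System.out.println(plain.distinctInstances());    // 3
        System.out.println(interned.distinctInstances()); // 1
    }
}
```

    Both lists still contain three entries; only the number of distinct string objects behind them differs.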