Tags: java, string, string-interning, string-pool

Is the String pool really empty initially, as stated in the Javadoc of the String.intern() method?


Below is the Javadoc comment for the String.intern() method:

“Returns a canonical representation for the string object.

A pool of strings, initially empty, is maintained privately by the class String.

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.

All literal strings and string-valued constant expressions are interned. String literals are defined in section 3.10.5 of The Java™ Language Specification.”
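The identity guarantee in the quoted Javadoc (s.intern() == t.intern() exactly when s.equals(t)) can be checked with a small sketch. The class name InternInvariantDemo and the helper sameCanonical are made up for illustration, and the strings are built from char arrays so that they start out as ordinary heap objects:

```java
class InternInvariantDemo {
    // True when both strings map to the same canonical pool entry.
    static boolean sameCanonical(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        // Two distinct heap objects with equal contents, plus one that differs.
        String s = new String(new char[]{'p', 'o', 'o', 'l'});
        String t = new String(new char[]{'p', 'o', 'o', 'l'});
        String u = new String(new char[]{'h', 'e', 'a', 'p'});

        System.out.println(s == t);              // false: different objects
        System.out.println(s.equals(t));         // true: same contents
        System.out.println(sameCanonical(s, t)); // true: equal strings share one pool entry
        System.out.println(sameCanonical(s, u)); // false: unequal strings never do
    }
}
```

This guarantee holds no matter what the pool contained beforehand; only the question of which object ends up as the canonical one depends on the pool's prior contents.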

But I think something changed from JDK 8u102 onwards.

Consider the example below:

public class Test1 {
    public static void main(String[] args) {
        String s1 = new String(new char[]{'J', 'a', 'v', 'a'});
        String s2 = s1.intern(); 
        System.out.println(s1 == s2);
    }
}

If you run the above program on JDK 7u80 (the last stable release of JDK 7) or on JDK 8 up to 8u101, the output is:
true

But if you run it on JDK 8u102 or later, or on JDK 9 or JDK 10, the output is:
false

Why did the intern() method start behaving differently from JDK 8u102 onwards?

I checked the release notes and the Javadoc comments but couldn’t find anything about changes to the intern() method in JDK 8u102.

I also checked blogs and other websites, with no luck.

When I tried a different string, however, the output did not change:

public class Test2 {
    public static void main(String[] args) {
        String s3 = new String(new char[]{'U', 'd', 'a', 'y', 'a', 'n'});
        String s4 = s3.intern();
        System.out.println(s3 == s4);
    }
}

The above program always prints true on JDK 7, JDK 8, JDK 9 and JDK 10.

The Test1 behaviour is possible only if “Java” is already referenced by the String Pool table before s1.intern() is called.
s1 refers to a String object “Java” on the heap, and s1.intern() returns the reference to the String Pool object (because an equal “Java” is already in the String Pool).
That is why s1 == s2 returns false.

When Test2 runs, on the other hand, “Udayan” is NOT referenced by the String Pool table.
s3 refers to a String object “Udayan” on the heap, and s3.intern() adds the object referenced by s3 to the String Pool and returns that same reference. This means s3 and s4 refer to the same object.
That is why s3 == s4 returns true.
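The mechanism described above can be reproduced on purpose. The following is a minimal sketch, not a claim about what the JDK does internally: it puts “Udayan” into the pool through a literal before calling intern(), which makes the Test2 string behave like the Test1 string. The class name PoolHitDemo and the helper internReturnsSelf are hypothetical names chosen for this example.

```java
class PoolHitDemo {
    // True when intern() hands back the very object it was called on,
    // i.e. no equal string was in the pool beforehand.
    static boolean internReturnsSelf(String s) {
        return s.intern() == s;
    }

    public static void main(String[] args) {
        // Evaluating this literal puts "Udayan" into the pool first.
        String literal = "Udayan";

        // A separate heap object with the same contents.
        String s3 = new String(new char[]{'U', 'd', 'a', 'y', 'a', 'n'});
        String s4 = s3.intern();

        System.out.println(s3 == s4);      // false: s4 is the pooled literal, not s3
        System.out.println(s4 == literal); // true: intern() returned the pooled object
    }
}
```

Any code that runs earlier and interns an equal string, whether through a literal or an explicit intern() call, has the same effect on a later intern() call.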

If my observation is correct, the pool of Strings is NOT empty initially.
It already contains “Java”, “java”, “Oracle” and other String objects by the time my code runs.

Can anyone please confirm this?


Solution

  • It depends on what you consider as "initially".

    When the JVM starts, the String Pool is empty. However, various basic JDK classes are loaded and initialized before your Test1 class is loaded, so it's not surprising that some of them add Strings to the String Pool. "Java" must be one of these Strings.

    Nothing in the JLS prevents the developers of Java from introducing new String literals in the initialization of classes in newer JDK versions. The difference you noticed between JDK versions (up to 8u101 versus 8u102 onwards) is therefore not surprising.
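One way to see which strings are already in the pool when your own code starts is to probe for them. This is a rough sketch under stated assumptions: PoolProbe and alreadyPooled are hypothetical names, and the probe is not side-effect free, because a string that was not pooled before the call is pooled afterwards. The characters are passed as a char array so that the probe itself does not introduce a literal for the string being tested.

```java
import java.util.UUID;

class PoolProbe {
    // Builds a fresh heap String from the characters and reports whether an
    // equal string was already in the pool. Side effect: if it was not,
    // the fresh String is in the pool after this call.
    static boolean alreadyPooled(char[] chars) {
        String fresh = new String(chars);
        return fresh.intern() != fresh;
    }

    public static void main(String[] args) {
        // The result for "Java" depends on the JDK build, as the question shows.
        System.out.println("Java pooled before main? "
                + alreadyPooled(new char[]{'J', 'a', 'v', 'a'}));

        // A random string that no JDK class could have interned yet.
        char[] unique = UUID.randomUUID().toString().toCharArray();
        System.out.println(alreadyPooled(unique)); // false: nothing interned it before
        System.out.println(alreadyPooled(unique)); // true: the first probe added it
    }
}
```

Running the first probe on different JDK builds should reproduce the difference reported in the question, without relying on any knowledge of which JDK class introduced the string.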