Here is the Javadoc for String#intern:
/**
* Returns a canonical representation for the string object.
* <p>
* A pool of strings, initially empty, is maintained privately by the
* class {@code String}.
* <p>
* When the intern method is invoked, if the pool already contains a
* string equal to this {@code String} object as determined by
* the {@link #equals(Object)} method, then the string from the pool is
* returned. Otherwise, this {@code String} object is added to the
* pool and a reference to this {@code String} object is returned.
* <p>
* It follows that for any two strings {@code s} and {@code t},
* {@code s.intern() == t.intern()} is {@code true}
* if and only if {@code s.equals(t)} is {@code true}.
* <p>
* All literal strings and string-valued constant expressions are
* interned. String literals are defined in section 3.10.5 of the
* <cite>The Java™ Language Specification</cite>.
*
* @return a string that has the same contents as this string, but is
* guaranteed to be from a pool of unique strings.
*/
Let's say I have the following code:
String ref1 = "ref";
String ref2 = ref1.intern();
At the point when ref2 is initialised, is ref1 still in the heap or not? I'm asking because, if it is, then interning a string without removing the original reference would double the RSS memory used by the Java process.
If we consider your example, yes, ref1 is still in the heap, but only because both ref1 and ref2 point to the same instance. You initialise ref1 with a string literal, and string literals are automatically interned, as described here:
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
So there is no double memory usage (unless you count the copy of the string held in the separate memory area that stores the class's constant pool and the rest of the class structure information).
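A quick way to check this for the snippet in the question is to compare the two references directly. This is only a minimal sketch: the class name RefCheck and the helper literalIsAlreadyInterned are made up for illustration.

```java
class RefCheck {
    // Hypothetical helper: true when a literal and its interned form are the same object.
    public static boolean literalIsAlreadyInterned() {
        String ref1 = "ref";          // string literal: already in the string pool
        String ref2 = ref1.intern();  // returns the pooled instance, which is ref1 itself
        return ref1 == ref2;          // reference comparison, not equals()
    }

    public static void main(String[] args) {
        System.out.println(literalIsAlreadyInterned()); // prints true
    }
}
```

Since the comparison is by reference, a result of true means there is only one String object behind both variables.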
To explain in a bit more detail how interning actually works, see this example:
public class Intern {
    public static void main(String... args) {
        String str1 = "TestStr";
        String str2 = "TestStr";
        System.out.println("1. " + (str1 == str2));
        String str3 = str1.intern();
        System.out.println("2. " + (str1 == str3));
        String str4 = new String("TestStr");
        System.out.println("3. " + (str1 == str4));
        String str5 = str4.intern();
        System.out.println("4. " + (str4 == str5));
        System.out.println("5. " + (str1 == str5));
    }
}
You'll get this output:
1. true
Strings loaded from the constant pool are automatically interned into the string pool, so the result is true: both variables refer to the same interned object.
2. true
str3 refers to a string instance that was already interned.
3. false
str4 is a new instance and has nothing to do with the previous ones.
4. false
The throwaway str4 instance is not the same object as the one that has been in the string pool since the beginning.
5. true
str5 points to our interned string, as expected.
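The same behaviour holds for strings built at run time, which are not interned automatically. The sketch below illustrates the rule from the Javadoc that s.intern() == t.intern() exactly when s.equals(t). The class name InternRuntime and the helper build are hypothetical, and StringBuilder is used only to make sure the value is not a compile-time constant.

```java
class InternRuntime {
    // Hypothetical helper: assembles "TestStr" at run time, so the result
    // is a fresh heap object rather than an interned constant.
    public static String build() {
        return new StringBuilder("Test").append("Str").toString();
    }

    public static void main(String[] args) {
        String a = build();
        String b = build();
        System.out.println(a == b);                   // prints false: two separate objects
        System.out.println(a.equals(b));              // prints true: same contents
        System.out.println(a.intern() == b.intern()); // prints true: one canonical instance
    }
}
```

If only the references returned by intern() are kept, the duplicate copies become unreachable and can be garbage-collected, so interning does not have to mean holding two copies of the same contents.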
It's important to note that before Java 7 (Oracle implementation), interned strings were stored in PermGen (which no longer exists as of Java 8); since Java 7 they have been stored in the heap. So, on an older release of the JVM, peculiar memory issues could appear when the interning feature is used heavily.
For additional info on how interned Strings are managed in different releases, check this nice post.