What is the difference between explicit and implicit instantiation of the String class in Java?


I have been told that creating a String instance like this

String s = new String("Don't do this"); // explicit

has a performance problem, since it creates two String instances: one for the double-quoted literal "Don't do this" and one for the new String() constructor call.

Today I had time to test it myself. I created two classes:

public class String1 {
    public static void main(String[] args) {
        String s = new String("Hello");
        System.out.println(s);
    }
}

public class String2 {
    public static void main(String[] args) {
        String s = "Hello";
        System.out.println(s);
    }
}

Here is the output of javap:

C:\jav>javap String1
Compiled from "String1.java"
public class String1 extends java.lang.Object{
    public String1();
    public static void main(java.lang.String[]);
}

C:\jav>javap String2
Compiled from "String2.java"
public class String2 extends java.lang.Object{
    public String2();
    public static void main(java.lang.String[]);
}

They seem to be the same. However, with the -c flag the outputs are different.

C:\jav>javap -c String1
Compiled from "String1.java"
public class String1 extends java.lang.Object{
public String1();
  Code:
  0:   aload_0
  1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
  4:   return

public static void main(java.lang.String[]);
  Code:
  0:   new     #2; //class java/lang/String
  3:   dup
  4:   ldc     #3; //String Hello
  6:   invokespecial   #4; //Method java/lang/String."<init>":(Ljava/lang/String;)V
  9:   astore_1
  10:  getstatic       #5; //Field java/lang/System.out:Ljava/io/PrintStream;
  13:  aload_1
  14:  invokevirtual   #6; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
  17:  return

}


C:\jav>javap -c String2
Compiled from "String2.java"
public class String2 extends java.lang.Object{
public String2();
  Code:
  0:   aload_0
  1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
  4:   return

public static void main(java.lang.String[]);
  Code:
  0:   ldc     #2; //String Hello
  2:   astore_1
  3:   getstatic       #3; //Field java/lang/System.out:Ljava/io/PrintStream;
  6:   aload_1
  7:   invokevirtual   #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
  10:  return

}

So here are my questions :) First, what are "ldc", "astore_1", etc.? Is there any documentation describing them? Second, can javac really not figure out that these two statements are equivalent?


Solution

  • Wikipedia has a very convenient summary of all the Java bytecode instructions. Also, to get the full picture, it's better to use javap -v, which shows the entire content of the class file, including the constant pool:

    Classfile /.../String1.class
      Last modified 02/05/2013; size 458 bytes
      MD5 checksum e3c355bf648c7441784ffc6b9765ba4d
      Compiled from "String1.java"
    public class String1
      SourceFile: "String1.java"
      minor version: 0
      major version: 51
      flags: ACC_PUBLIC, ACC_SUPER
    Constant pool:
       #1 = Methodref          #8.#17         //  java/lang/Object."<init>":()V
       #2 = Class              #18            //  java/lang/String
       #3 = String             #19            //  Hello
       #4 = Methodref          #2.#20         //  java/lang/String."<init>":(Ljava/lang/String;)V
       #5 = Fieldref           #21.#22        //  java/lang/System.out:Ljava/io/PrintStream;
       #6 = Methodref          #23.#24        //  java/io/PrintStream.println:(Ljava/lang/String;)V
       #7 = Class              #25            //  String1
       #8 = Class              #26            //  java/lang/Object
       #9 = Utf8               <init>
      #10 = Utf8               ()V
      #11 = Utf8               Code
      #12 = Utf8               LineNumberTable
      #13 = Utf8               main
      #14 = Utf8               ([Ljava/lang/String;)V
      #15 = Utf8               SourceFile
      #16 = Utf8               String1.java
      #17 = NameAndType        #9:#10         //  "<init>":()V
      #18 = Utf8               java/lang/String
      #19 = Utf8               Hello
      #20 = NameAndType        #9:#27         //  "<init>":(Ljava/lang/String;)V
      #21 = Class              #28            //  java/lang/System
      #22 = NameAndType        #29:#30        //  out:Ljava/io/PrintStream;
      #23 = Class              #31            //  java/io/PrintStream
      #24 = NameAndType        #32:#27        //  println:(Ljava/lang/String;)V
      #25 = Utf8               String1
      #26 = Utf8               java/lang/Object
      #27 = Utf8               (Ljava/lang/String;)V
      #28 = Utf8               java/lang/System
      #29 = Utf8               out
      #30 = Utf8               Ljava/io/PrintStream;
      #31 = Utf8               java/io/PrintStream
      #32 = Utf8               println
    {
      public String1();
        flags: ACC_PUBLIC
        Code:
          stack=1, locals=1, args_size=1
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: return
          LineNumberTable:
            line 1: 0
    
      public static void main(java.lang.String[]);
        flags: ACC_PUBLIC, ACC_STATIC
        Code:
          stack=3, locals=2, args_size=1
             0: new           #2                  // class java/lang/String
             3: dup
             4: ldc           #3                  // String Hello
             6: invokespecial #4                  // Method java/lang/String."<init>":(Ljava/lang/String;)V
             9: astore_1
            10: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
            13: aload_1
            14: invokevirtual #6                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
            17: return
          LineNumberTable:
            line 3: 0
            line 4: 10
            line 5: 17
    }
    

    Now it's clear where ldc loads the constant from: entry #3 of the constant pool, the String Hello.
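
To make the instruction names less mysterious, here is a rough toy model of what ldc, astore_1 and aload_1 do in String2.main. It is only a sketch of the idea, not how the JVM is implemented: the class name BytecodeSketch, the CONSTANT_POOL array and the run() method are made up for illustration, and the real constant pool holds many more entry types than a plain Object array can express.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BytecodeSketch {
    // Hypothetical stand-in for the constant pool of String2.class.
    // Index 2 mirrors "ldc #2; //String Hello" from the javap -c output.
    static final Object[] CONSTANT_POOL = new Object[3];
    static {
        CONSTANT_POOL[2] = "Hello";
    }

    // Walks through ldc #2, astore_1, aload_1 on a toy operand stack.
    static Object run() {
        Deque<Object> stack = new ArrayDeque<>(); // operand stack
        Object[] locals = new Object[2];          // slot 0 = args, slot 1 = s

        stack.push(CONSTANT_POOL[2]); // ldc #2   : push the constant onto the stack
        locals[1] = stack.pop();      // astore_1 : pop a reference into local slot 1
        stack.push(locals[1]);        // aload_1  : push the reference from local slot 1
        return stack.pop();           // the value println would receive
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model, the "explicit" version only adds new (allocate an object), dup (duplicate the reference on top of the stack) and invokespecial (call the constructor) around the same ldc.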

    Regarding your question about why javac doesn't bother with these optimizations: almost all optimization in Java is deferred to runtime, where a different compiler runs, the JIT compiler, which compiles Java bytecode to native machine code. javac does make some effort to optimize the "common" cases, but it is far less aggressive than the JIT.
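
It is also worth noting that the two statements are not strictly equivalent, which is another reason javac could not simply rewrite one into the other: new always yields a fresh object, while equal string literals refer to the same pooled instance. A minimal sketch of that difference follows; the class and method names are mine, chosen only for illustration.

```java
public class StringIdentity {
    // Two identical literals refer to the same pooled String object.
    static boolean literalsShareInstance() {
        String a = "Hello";
        String b = "Hello";
        return a == b;
    }

    // new String(...) creates a distinct object with equal contents.
    static boolean constructorMakesNewInstance() {
        String a = "Hello";
        String b = new String("Hello");
        return a != b && a.equals(b);
    }

    // intern() maps the copy back to the pooled instance.
    static boolean internReturnsPooledInstance() {
        return new String("Hello").intern() == "Hello";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());
        System.out.println(constructorMakesNewInstance());
        System.out.println(internReturnsPooledInstance());
    }
}
```

Since code may legitimately rely on the reference comparison in the second method, the compiler has to keep the new/dup/invokespecial sequence as written.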