We have the behaviour that the Java compiler uses the same instance whenever the same string constant is used:
String a = "abc";
String b = "abc";
// a == b is true
String c = new String("abc");
// c is a brand new object on the heap
Why doesn't the Java compiler optimize away the new String and substitute the equivalent assignment? Was there some deep design decision behind this, or is it just a coincidence? Can we expect a different JVM or compiler to be more aggressive and actually replace heap instances of immutable objects with well-known static ones? While String is the most notorious example, we could have the same behaviour for Integer, for example.
First of all, the String(String) "copy" constructor stems from the earliest days of Java and is an anomaly. It may exist because of String.intern(), which does a bit of copy prevention, as do the "..." constants. The constructor is never needed, as String is an immutable final class.
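As a minimal sketch of these identity rules (the class and method names here are made up for illustration), the following compares a literal, a copy made with the constructor, and the result of intern():

```java
public class StringIdentityDemo {

    // Two identical literals are compile-time constants and refer to one pooled instance.
    static boolean literalsShareInstance() {
        String a = "abc";
        String b = "abc";
        return a == b;
    }

    // The copy constructor always yields a distinct object with equal content.
    static boolean copyIsDistinctButEqual() {
        String a = "abc";
        String c = new String("abc");
        return c != a && c.equals(a);
    }

    // intern() maps the copy back to the pooled instance used by the literal.
    static boolean internReturnsPooledInstance() {
        String c = new String("abc");
        return c.intern() == "abc";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());       // true
        System.out.println(copyIsDistinctButEqual());      // true
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```

The language guarantees a fresh object for every new expression, so a compiler cannot silently replace new String("abc") with the literal without changing what == observes.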
For Integer there is Integer.valueOf(int), which uses a cache of instances that by default holds the values -128 up to 127.
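A small sketch of the cache in action (IntegerCacheDemo and sameInstance are illustrative names, not part of any API). Only the range -128 to 127 is guaranteed to be cached; what happens outside that range depends on the JVM and its settings, so no result is claimed for it here:

```java
public class IntegerCacheDemo {

    // True when two valueOf calls for the same value return the same object.
    static boolean sameInstance(int value) {
        Integer first = Integer.valueOf(value);
        Integer second = Integer.valueOf(value);
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(-128)); // true
        System.out.println(sameInstance(127));  // true
        // Outside the guaranteed range the result depends on the JVM and its settings.
        System.out.println(sameInstance(1000));
    }
}
```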
Despite the very competent compiler development team involved, the Java bytecode compiler compiles very naively. On the way from bytecode to machine code, however, some nice things may happen. For instance, an object may not be created on the heap as such, but on the stack.
Simplistic compilation is at least less likely to contain errors than the data-flow analysis behind a smart trick. (It also provides a good reason for good code style.)
An example:
List<String> list = ...
String[] array1 = list.toArray(new String[0]);
String[] array2 = list.toArray(new String[list.size()]);
toArray needs an actual array instance because, due to type erasure, the List list no longer knows that it contains Strings.
Historically, as an optimization, one could pass an array of fitting size (here the version with list.size()), which would then be filled and returned. That looks more optimal and faster, and some style checkers still flag the first version. In fact, however, the first version is faster, because a different bytecode array instantiation is used, and array1 will be generated fractionally faster.
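The two call styles can be compared in a short sketch (ToArrayDemo and its method names are made up for illustration). It shows only that both produce the same contents and that a fitting array is handed back as-is; it does not measure which one is faster:

```java
import java.util.Arrays;
import java.util.List;

public class ToArrayDemo {

    // The list allocates a correctly sized String[] itself, using the argument only for its type.
    static String[] viaEmptyArray(List<String> list) {
        return list.toArray(new String[0]);
    }

    // The caller allocates the array; the list fills it in.
    static String[] viaPresizedArray(List<String> list) {
        return list.toArray(new String[list.size()]);
    }

    // When the passed array is large enough, toArray returns that same array object.
    static boolean presizedArrayIsReturned(List<String> list) {
        String[] target = new String[list.size()];
        return list.toArray(target) == target;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("x", "y", "z");
        System.out.println(Arrays.toString(viaEmptyArray(list)));    // [x, y, z]
        System.out.println(Arrays.toString(viaPresizedArray(list))); // [x, y, z]
        System.out.println(presizedArrayIsReturned(list));           // true
    }
}
```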
The same story holds for division by certain numbers. In C there are many compiler optimisations that replace divisions with faster shifts. In Java this is (partly) done in the compilation from bytecode to machine code, a more logical place for these optimisations.
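To illustrate why such a replacement takes some care, here is a sketch for division by 8 (the class and method names are hypothetical, and this is not a claim about the exact instructions any particular JIT emits). A plain arithmetic shift rounds toward negative infinity, whereas integer division rounds toward zero, so negative inputs need a small correction:

```java
public class DivisionByShiftDemo {

    // Plain arithmetic shift: rounds toward negative infinity.
    static int naiveShift(int x) {
        return x >> 3;
    }

    // Adds a bias of 7 for negative inputs so that the result rounds toward zero, like x / 8.
    static int correctedShift(int x) {
        int bias = (x >> 31) >>> 29; // 7 if x is negative, otherwise 0
        return (x + bias) >> 3;
    }

    public static void main(String[] args) {
        System.out.println(-9 / 8);             // -1
        System.out.println(naiveShift(-9));     // -2
        System.out.println(correctedShift(-9)); // -1
    }
}
```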
I personally think an optimizing bytecode compiler would be nice, maybe something for university projects. However, it might not be justifiable just for code improvements, like not using .equals for enum values.
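For the enum case, a minimal sketch (Color is a hypothetical enum used only for illustration) shows why == is sufficient: each enum constant exists exactly once, so identity and equals agree, and == additionally tolerates null on the left-hand side:

```java
public class EnumComparisonDemo {

    // Hypothetical enum, used only for illustration.
    enum Color { RED, GREEN }

    // Identity comparison; safe even when left is null.
    static boolean sameByIdentity(Color left, Color right) {
        return left == right;
    }

    // equals comparison; throws NullPointerException when left is null.
    static boolean sameByEquals(Color left, Color right) {
        return left.equals(right);
    }

    public static void main(String[] args) {
        Color parsed = Color.valueOf("RED");
        System.out.println(sameByIdentity(parsed, Color.RED)); // true
        System.out.println(sameByEquals(parsed, Color.RED));   // true
        System.out.println(sameByIdentity(null, Color.RED));   // false
    }
}
```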