Tags: java, garbage-collection, javac

Is the return expression always computed, or can it be optimized out by the compiler?


I know that in this code:

public static void main(String[] args) { myMethod(); }

private static Object myMethod() {
    Object o = new Object();
    return o;
}

the object referenced by o becomes eligible for garbage collection after myMethod returns, because the return value is not assigned to anything and therefore there are no remaining references to it. But what if the code is something like:

public static void main(String[] args) { myMethod(); }

private static Object myMethod() {
    int i = 5;
    return i + 10;   // autoboxed to an Integer, since the return type is Object
}

Will the compiler even bother processing i + 10, seeing as the return value is not assigned?

And if i was not a simple primitive, but a larger object:

public static void main(String[] args) { myMethod(); }

private static Object myMethod() {
    return new LargeObject();
}

where LargeObject has an expensive constructor, will the compiler still allocate memory and call the constructor, in case it has any side effects?

This would be especially important if the return expression is complex, but has no side effects, such as:

public static void main(String[] args) {
    List<Integer> list = new LinkedList<>();
    getMiddle(list);
}

private static Object getMiddle(List<Integer> list) {
    return list.get(list.size() / 2);
}

Calling this method in real life without using the return value would be fairly pointless, but it's for the sake of example.

My question is: Given these examples (object constructor, operation on primitive, method call with no side effects), can the compiler skip the return statement of a method if it sees that the value won't be assigned to anything?

I know I could come up with many tests for these problems, but I don't know if I would trust them. My understanding of code optimization and GC is fairly basic, but I think I know enough to say that the treatment of specific bits of code isn't necessarily generalizable. This is why I'm asking.


Solution

  • First, let's deal with a misconception that is apparent in your question, and some of the comments.

    In a HotSpot (Oracle or OpenJDK) Java platform, there are actually two compilers that have to be considered:

    • The javac compiler translates Java source code to bytecodes. It does minimal optimization. In fact, the only significant optimizations that it does are the evaluation of compile-time-constant expressions (which is actually necessary for certain compile-time checks) and the re-writing of String concatenation sequences.

      You can easily see what optimizations are done ... using javap ... but it can also be misleading, because the heavy-duty optimization has not been done yet. Basically, the javap output is mostly unhelpful when it comes to optimization.
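      For example (a minimal sketch; the exact bytecode varies between JDK versions), javac will fold a constant expression before the JIT ever sees it, and you can confirm this with javap:

          // ConstFold.java -- compile with "javac ConstFold.java",
          // then disassemble with "javap -c ConstFold"
          public class ConstFold {
              static int answer() {
                  return 6 * 7;   // javac emits the literal 42 (bipush 42); no multiplication at runtime
              }
          }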

    • The JIT compiler does the heavy-weight optimization. It is invoked at runtime while your program is running.

      It is not invoked immediately. Typically your bytecodes are interpreted for the first few times that any method is called. The JVM is gathering behavioral stats that will be used by the JIT compiler to optimize (!).


    So, in your example, the main method is called once and myMethod is called once. The JIT compiler won't even run, so in fact the bytecodes will be interpreted. But that is cool. It would take orders of magnitude more time for the JIT compiler to optimize the method than you would save by running the optimized code.
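    You can see this for yourself (a rough sketch; -XX:+PrintCompilation is a standard HotSpot flag, though its output format is unspecified and varies between JVM versions). A method called only once rarely shows up in the compilation log, while a method called in a hot loop does:

        // Run with:  java -XX:+PrintCompilation WarmUp
        public class WarmUp {
            static int addTen(int i) {
                return i + 10;
            }

            public static void main(String[] args) {
                long sum = 0;
                for (int n = 0; n < 1_000_000; n++) {
                    sum += addTen(n);     // called often enough to trigger JIT compilation
                }
                System.out.println(sum);  // use the result so the work cannot be discarded
            }
        }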

    But supposing the optimizer did run ...

    The JIT compiler generally has a couple of strategies:

    • Within a method, it optimizes based on the information local to the method.
    • When a method is called, it looks to see if the called method can be inlined at the call site. After the inlining, the code can then be further optimized in its context.

    So here's what is likely to happen.

    • When your myMethod() is optimized as a free-standing method, the unnecessary statements will not be optimized away, because they won't be unnecessary in all possible contexts.

    • When / if a method call to myMethod() is inlined (e.g. into the main(...) method), the optimizer will then determine that (for example) these statements

          int i = 5;
          return i + 10;
      

      are unnecessary in this context, and optimize them away.
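      Conceptually (this is a sketch of the effect, not actual JIT output), after inlining and dead-code elimination the call site behaves as if main contained no trace of the computation:

          public static void main(String[] args) {
              // myMethod() was inlined; its result (the boxed value of i + 10)
              // is never used and has no observable side effects, so the JIT
              // is free to eliminate the whole sequence.
          }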

    But bear in mind that JIT compilers are evolving all of the time. So predicting exactly what optimizations will occur, and when, is next to impossible. And probably fruitless.

    Advice:

    • It is worthwhile thinking about whether you are doing unnecessary calculations at the "gross" level. Choosing the correct algorithm or data structure is often critical (see the sketch after this list).

    • At the fine-grained level, it is generally not worth it. Let the JIT compiler deal with it.

      UNLESS you have clear evidence that you need to optimize (i.e. a benchmark that is objectively too slow), and clear evidence there is a performance bottleneck at a particular point (e.g. profiling results).
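    To make the "gross level" point concrete, here is a small sketch built on your getMiddle() example (MiddleDemo is a made-up class name; LinkedList and ArrayList are the real java.util classes). The same line of code is a linear-time walk on a LinkedList but a constant-time indexed access on an ArrayList, and that choice matters far more than whether the JIT eliminates an unused return value:

        import java.util.ArrayList;
        import java.util.LinkedList;
        import java.util.List;

        public class MiddleDemo {
            static Integer getMiddle(List<Integer> list) {
                return list.get(list.size() / 2);   // O(n) for LinkedList, O(1) for ArrayList
            }

            public static void main(String[] args) {
                List<Integer> linked = new LinkedList<>();
                List<Integer> array = new ArrayList<>();
                for (int i = 0; i < 1_000_000; i++) {
                    linked.add(i);
                    array.add(i);
                }
                // Same source code, very different cost: LinkedList.get walks half the
                // list to reach the middle, ArrayList.get is a single array access.
                System.out.println(getMiddle(linked));
                System.out.println(getMiddle(array));
            }
        }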