In a performance test of a Java REST service, I got an unexpected result: a method that creates and returns the same value object on every invocation runs faster than versions that simply return a value object stored in a static or instance field.
Code:
@POST @Path("inline") public Response inline(String s) {
    return Response.status(Status.CREATED).build();
}
private static final Response RESP = Response.status(Status.CREATED).build();
@POST @Path("staticfield") public Response staticField(String s) {
    return RESP;
}
private final Response resp = Response.status(Status.CREATED).build();
@POST @Path("field") public Response field(String s) {
    return resp;
}
Byte code:
Performance (using Apache AB, single thread, several runs with consistent results):
Environment: RHEL6 + Oracle JDK 1.7.0_60-b19, 64-bit
Is it possible that the JVM compiled the inline version to optimized native code, but never considered optimizing the other two because they are already pretty small?
As pointed out in the comments, it is difficult to tell without actually looking at the assembly. As you are using a REST framework, however, I assume that it would be hard to tell even from the assembly, as there is quite a lot of code to read.
Instead, I want to give you an educated guess, because your code is an archetypal example of where constant folding applies. When a value is created inline and not read from a field, the JVM can safely assume that this value is constant. When JIT-compiling the method, the constant expression can therefore be safely merged with your framework code, which probably leads to less JIT-compiled assembly and therefore improved performance. For a field value, even a final one, a constant value cannot be assumed because the field value can change. (This holds as long as the field is not a compile-time constant, that is, a primitive or a constant String, which javac inlines.) The JVM can therefore probably not constant-fold the value.
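To make the javac part of that remark concrete, here is a minimal sketch of the difference between a compile-time constant and a final field that merely holds a value computed at run time. The class and field names are made up for illustration, and it only demonstrates what javac does, not what the JIT compiler does later.

```java
// Illustrative only: the names below are hypothetical and not taken from the question.
final class ConstantFoldingDemo {
    // Compile-time constant: a final String initialized with a constant expression.
    static final String PREFIX = "cre";

    // Final, but NOT a compile-time constant: the initializer creates an object at run time.
    static final String RUNTIME_PREFIX = new String("cre");

    static boolean foldedByJavac() {
        // javac folds PREFIX + "ated" into the literal "created". Constant strings
        // are interned, so both sides refer to the same object.
        return (PREFIX + "ated") == "created";
    }

    static boolean foldedFromRuntimeField() {
        // RUNTIME_PREFIX is not a constant expression, so the concatenation happens
        // at run time and produces a newly created String object.
        return (RUNTIME_PREFIX + "ated") == "created";
    }

    public static void main(String[] args) {
        System.out.println(foldedByJavac());          // prints "true"
        System.out.println(foldedFromRuntimeField()); // prints "false"
    }
}
```

The reference comparison with == is deliberate here: it is only a way to observe whether the expression was folded into a single interned literal.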
You can read more on constant folding in the JMH tutorial, where it is noted:
If JVM realizes the result of the computation is the same no matter what, it can cleverly optimize it. In our case, that means we can move the computation outside of the internal JMH loop. This can be prevented by always reading the inputs from the state, computing the result based on that state, and follow the rules to prevent DCE.
I hope you used such a framework. Otherwise, your performance metric is unlikely to be valid.
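If you want to look at the three variants in isolation from the REST framework, a stand-in along the following lines could serve as the code under test. This is only a sketch: Value is a hypothetical replacement for the JAX-RS Response, Variants is a made-up class name, and the loop in main merely keeps the results live. It is not a valid benchmark on its own; the methods would still have to be driven by a harness such as JMH.

```java
// Hypothetical stand-in for the JAX-RS Response; not part of any framework.
final class Value {
    final int status;

    Value(int status) {
        this.status = status;
    }
}

// Mirrors the three endpoint variants from the question without any REST plumbing.
final class Variants {
    private static final Value STATIC_VALUE = new Value(201);
    private final Value instanceValue = new Value(201);

    // Creates a fresh object on every call, like the "inline" endpoint.
    Value inline() {
        return new Value(201);
    }

    // Returns the object held in a static final field.
    Value staticField() {
        return STATIC_VALUE;
    }

    // Returns the object held in a final instance field.
    Value field() {
        return instanceValue;
    }

    public static void main(String[] args) {
        Variants v = new Variants();
        long sink = 0;
        // Accumulate into a value that is printed afterwards, so the calls
        // cannot simply be discarded as dead code.
        for (int i = 0; i < 1_000_000; i++) {
            sink += v.inline().status + v.staticField().status + v.field().status;
        }
        System.out.println(sink);
    }
}
```

Note that the variants are not equivalent in what they return: inline() hands out a new object each time, while the two field versions always return the same instance.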
From reading the byte code, you generally cannot learn much about runtime performance, because the JIT compiler can transform the byte code into almost anything during optimization. The byte code layout should only matter while code is interpreted, which is generally not the state in which one would measure performance, as performance-critical, hot code is always JIT-compiled.