java performance jvm-hotspot

Returning two values from Java function efficiently


Does anybody know if there is a way to return two values from a Java method with (close to) zero overhead? I only need two values. My use cases range from processing an array of bytes (where I need the return value and the next starting position), to returning a value together with an error code, to some ugly fixed-point calculations where I need both the whole and the fractional part.

I'm not above some really ugly hacks. The function is small and Hotspot happily inlines it. So now I just need to get Hotspot to elide any object creation or bit shifting.

If I restrict my returned values to ints, I can pack them into a long. I've tried that, but even after inlining, Hotspot cannot seem to figure out that all the bit shifts and masks don't really do anything, and it happily packs and unpacks the ints into the same values (clearly a place where Hotspot's peephole optimizer needs help). But at least I'm not creating an object.
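For readers who have not seen the trick, here is a minimal sketch of the int-packing approach described above. The class and method names (`IntPairPacking`, `pack`, `high`, `low`) are made up for illustration and are not from any library.

```java
final class IntPairPacking {
    // Pack two ints into one long: 'hi' in the upper 32 bits, 'lo' in the lower 32.
    static long pack(int hi, int lo) {
        // Mask 'lo' so a negative value does not sign-extend into the upper half.
        return ((long) hi << 32) | (lo & 0xFFFFFFFFL);
    }

    // Recover the upper 32 bits as an int.
    static int high(long packed) {
        return (int) (packed >> 32);
    }

    // Recover the lower 32 bits as an int (the cast drops the upper half).
    static int low(long packed) {
        return (int) packed;
    }

    public static void main(String[] args) {
        long packed = pack(-7, 42);
        System.out.println(high(packed) + " " + low(packed)); // prints "-7 42"
    }
}
```

This avoids allocation entirely, but it only works when both values fit in 32 bits each, and the shifts and masks remain in the generated code unless the JIT manages to cancel them.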

My more difficult case is when one of the items I need to return is a reference and the other is a long or another reference (for the int case, I think I can compress the OOP and use the bit packing described above).

Has anybody tried to get Hotspot to generate garbage-free code for this? Worst case right now is that I have to carry around an object and pass it in, but I'd like to keep it self-contained. Thread locals are expensive (hash lookups), and the solution needs to be reentrant.
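To make the "carry around an object and pass it in" fallback concrete, here is a rough sketch based on the byte-array use case. `ByteDecoder`, `Result`, and `readU16` are hypothetical names chosen for this example, and the big-endian 16-bit format is just an assumption to keep it short.

```java
final class ByteDecoder {
    // Mutable holder owned by the caller; reusing it avoids per-call allocation.
    static final class Result {
        int value;    // the decoded value
        int nextPos;  // index of the first byte not yet consumed
    }

    // Reads a big-endian unsigned 16-bit value at 'pos' and stores both
    // the value and the next starting position in 'out'.
    static void readU16(byte[] buf, int pos, Result out) {
        out.value = ((buf[pos] & 0xFF) << 8) | (buf[pos + 1] & 0xFF);
        out.nextPos = pos + 2;
    }

    public static void main(String[] args) {
        byte[] buf = {0x01, 0x02, (byte) 0xFF, (byte) 0xFE};
        Result r = new Result(); // allocated once, reused for every call
        int pos = 0;
        while (pos < buf.length) {
            readU16(buf, pos, r);
            System.out.println(r.value);
            pos = r.nextPos;
        }
    }
}
```

Because each caller owns its holder, this stays reentrant without thread locals. The cost is exactly the one mentioned above: the holder leaks into the method signature.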


Solution

  • The -XX:+EliminateAllocations optimization (on by default in Java 8) works fine for that.

    Whenever you return new Pair(a, b) right at the end of the callee method and use the result immediately in the caller, the JVM is very likely to apply scalar replacement, provided the callee is inlined.

    A simple experiment shows there's nearly no overhead in returning an object. This is not only an efficient way, but also the most readable one.

    Benchmark                        Mode  Cnt    Score   Error   Units
    ReturnPair.manualInline         thrpt   30  127.713 ± 3.408  ops/us
    ReturnPair.packToLong           thrpt   30  113.606 ± 1.807  ops/us
    ReturnPair.pairObject           thrpt   30  126.881 ± 0.478  ops/us
    ReturnPair.pairObjectAllocated  thrpt   30   92.477 ± 0.621  ops/us
    

    The benchmark:

    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;
    
    import java.util.concurrent.ThreadLocalRandom;
    
    @State(Scope.Benchmark)
    public class ReturnPair {
        int counter;
    
        @Benchmark
        public void manualInline(Blackhole bh) {
            bh.consume(counter++);
            bh.consume(ThreadLocalRandom.current().nextInt());
        }
    
        @Benchmark
        public void packToLong(Blackhole bh) {
            long packed = getPacked();
            bh.consume((int) (packed >>> 32));
            bh.consume((int) packed);
        }
    
        @Benchmark
        public void pairObject(Blackhole bh) {
            Pair pair = getPair();
            bh.consume(pair.a);
            bh.consume(pair.b);
        }
    
        @Benchmark
        @Fork(jvmArgs = "-XX:-EliminateAllocations")
        public void pairObjectAllocated(Blackhole bh) {
            Pair pair = getPair();
            bh.consume(pair.a);
            bh.consume(pair.b);
        }
    
        public long getPacked() {
            int a = counter++;
            int b = ThreadLocalRandom.current().nextInt();
            return (long) a << 32 | (b & 0xffffffffL);
        }
    
        public Pair getPair() {
            int a = counter++;
            int b = ThreadLocalRandom.current().nextInt();
            return new Pair(a, b);
        }
    
        static class Pair {
            final int a;
            final int b;
    
            Pair(int a, int b) {
                this.a = a;
                this.b = b;
            }
        }
    }
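Outside of a benchmark, the same pattern might look like the sketch below, applied to the fixed-point case from the question. The names `FixedPoint`, `Parts`, `split`, and `roundToNearest` and the 16.16 layout are assumptions made for illustration. Whether the allocation is actually eliminated depends on the callee being inlined and the object not escaping, so it is worth verifying on your own workload.

```java
final class FixedPoint {
    // Small immutable carrier for the two results.
    static final class Parts {
        final int whole;
        final int fraction;

        Parts(int whole, int fraction) {
            this.whole = whole;
            this.fraction = fraction;
        }
    }

    // Splits a 16.16 fixed-point value into its whole part (floored)
    // and its fractional part (0..0xFFFF); the object is created at the very end.
    static Parts split(int fixed) {
        return new Parts(fixed >> 16, fixed & 0xFFFF);
    }

    // Uses the result immediately and never stores or passes it on,
    // which is the shape that makes scalar replacement likely.
    static int roundToNearest(int fixed) {
        Parts p = split(fixed);
        return p.fraction >= 0x8000 ? p.whole + 1 : p.whole;
    }

    public static void main(String[] args) {
        int x = (3 << 16) | 0xC000; // 3.75 in 16.16 fixed point
        System.out.println(roundToNearest(x)); // prints "4"
    }
}
```

Storing the `Parts` object in a field, a collection, or passing it to a non-inlined method would let it escape, and a real allocation would then be expected.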