Tags: java, performance-testing, floating-point-conversion, microbenchmark, jmh

Random data with JMH Java microbenchmark testing floating point printing


I'm writing a JMH microbenchmark for floating point printing code I wrote. I'm not overly concerned with exact performance numbers yet; right now I want to get the benchmark code correct.

I want to loop over some randomly generated data, so I build some static arrays of data and keep my loop machinery (increment and mask) as simple as possible. Is this the correct approach, or should I be telling JMH more about what is going on with some annotations I'm missing?

Also, is it possible to make display groups for the tests instead of just lexicographic order? I basically have two groups of tests (one group for each set of random data).

The full source is at https://github.com/jnordwick/zerog-grisu

Here is the benchmark code:

package zerog.util.grisu;

import java.util.Random;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

/* 
 * Current JMH bench, similar on small numbers (no fast path code yet)
 * and 40% faster on completely random numbers.
 * 
 * Benchmark                         Mode  Cnt         Score         Error  Units
 * JmhBenchmark.test_lowp_doubleto  thrpt   20  11439027.798 ± 2677191.952  ops/s
 * JmhBenchmark.test_lowp_grisubuf  thrpt   20  11540289.271 ±  237842.768  ops/s
 * JmhBenchmark.test_lowp_grisustr  thrpt   20   5038077.637 ±  754272.267  ops/s
 * 
 * JmhBenchmark.test_rand_doubleto  thrpt   20   1841031.602 ±  219147.330  ops/s
 * JmhBenchmark.test_rand_grisubuf  thrpt   20   2609354.822 ±   57551.153  ops/s
 * JmhBenchmark.test_rand_grisustr  thrpt   20   2078684.828 ±  298474.218  ops/s
 * 
 * This doesn't account for any garbage costs either, since the benchmarks
 * aren't generating enough to trigger GC, and Java internally uses per-thread
 * objects to avoid some allocations.
 * 
 * Don't call Grisu.doubleToString() except for testing. I think the extra
 * allocations and copying are killing it. I'll fix that.
 */

public class JmhBenchmark {

    static final int nmask = 1024*1024 - 1;
    static final double[] random_values = new double[nmask + 1];
    static final double[] lowp_values = new double[nmask + 1];

    static final byte[] buffer = new byte[30];
    static final byte[] bresults = new byte[30];

    static int i = 0;
    static final Grisu g = Grisu.fmt;

    static {

        Random r = new Random();
        int[] pows = new int[] { 1, 10, 100, 1000, 10000, 100000, 1000000 };

        for( int i = 0; i < random_values.length; ++i ) {
            random_values[i] = r.nextDouble();
        }

        for(int i = 0; i < lowp_values.length; ++i ) {
            // cast to double so the division doesn't truncate to a whole number
            lowp_values[i] = (1 + r.nextInt( 10000 )) / (double) pows[r.nextInt( pows.length )];
        }
    }

    @Benchmark
    public String test_rand_doubleto() {
        String s = Double.toString( random_values[i] );
        i = (i + 1) & nmask;
        return s;
    }

    @Benchmark
    public String test_lowp_doubleto() {
        String s = Double.toString( lowp_values[i] );
        i = (i + 1) & nmask;
        return s;
    }

    @Benchmark
    public String test_rand_grisustr() {
        String s = g.doubleToString( random_values[i] );
        i = (i + 1) & nmask;
        return s;
    }

    @Benchmark
    public String test_lowp_grisustr() {
        String s = g.doubleToString( lowp_values[i] );
        i = (i + 1) & nmask;
        return s;
    }

    @Benchmark
    public byte[] test_rand_grisubuf() {
        g.doubleToBytes( bresults, 0, random_values[i] );
        i = (i + 1) & nmask;
        return bresults;
    }

    @Benchmark
    public byte[] test_lowp_grisubuf() {
        g.doubleToBytes( bresults, 0, lowp_values[i] );
        i = (i + 1) & nmask;
        return bresults;
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(".*" + JmhBenchmark.class.getSimpleName() + ".*")
                .warmupIterations(20)
                .measurementIterations(20)
                .forks(1)
                .build();

        new Runner(opt).run();
    }
}

Solution

  • You can only prove the benchmark is correct by analyzing its results. The benchmark code itself can only raise red flags that you then have to follow up on. I see these red flags in your code:

    1. Reliance on static final fields to store the state. The contents of these fields can routinely be "inlined" into the computation, rendering parts of your benchmark futile. JMH only saves the regular fields of @State objects from constant-folding; static finals get no such protection (see the sketch after this list).

    2. Using static initializers. While this has no repercussions in current JMH, the expected way is to initialize state in @Setup methods. In your case it also helps you get truly random data points, e.g. with @Setup(Level.Iteration) the values are reinitialized before each iteration of the test (also shown in the sketch below).
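
    A minimal sketch of both fixes, assuming JMH is on the classpath. The class names JmhBenchmarkSketch and Data are hypothetical; Scope.Thread is chosen so the index cursor stays per-thread in case the benchmark is ever run with more threads:

    package zerog.util.grisu;

    import java.util.Random;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.Level;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;

    public class JmhBenchmarkSketch {

        @State(Scope.Thread)
        public static class Data {
            static final int nmask = 1024 * 1024 - 1;

            // Plain instance fields: JMH shields @State fields from
            // constant-folding; static finals get no such protection.
            double[] random_values = new double[nmask + 1];
            int i = 0;

            // Runs before every measurement iteration, so each
            // iteration sees freshly generated data points.
            @Setup(Level.Iteration)
            public void fill() {
                Random r = new Random();
                for (int j = 0; j < random_values.length; ++j) {
                    random_values[j] = r.nextDouble();
                }
                i = 0;
            }
        }

        @Benchmark
        public String test_rand_doubleto(Data d) {
            String s = Double.toString( d.random_values[d.i] );
            d.i = (d.i + 1) & Data.nmask;
            return s;
        }
    }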

    As far as the general approach is concerned, this is one of the ways to achieve safe looping: keeping the loop counter outside the method, as you do. There is another, arguably safe, way: loop over the whole array inside the method and sink every iteration's result into Blackhole.consume, as sketched below.
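
    A sketch of that variant, reusing the hypothetical Data state class from above. @OperationsPerInvocation tells JMH that one invocation covers nmask + 1 operations, so the reported ops/s stay comparable to the per-call benchmarks:

    package zerog.util.grisu;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.OperationsPerInvocation;
    import org.openjdk.jmh.infra.Blackhole;

    public class JmhLoopSketch {

        @Benchmark
        @OperationsPerInvocation(JmhBenchmarkSketch.Data.nmask + 1)
        public void test_rand_doubleto_loop(JmhBenchmarkSketch.Data d, Blackhole bh) {
            // Sink every per-element result into the Blackhole so the
            // JIT cannot eliminate Double.toString as dead code.
            for (int j = 0; j < d.random_values.length; ++j) {
                bh.consume(Double.toString(d.random_values[j]));
            }
        }
    }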