Search code examples
javaperformancedivisioninteger-division

Fast integer division in Java


It is well known that integer division is slow operation (typically several times slower than integer multiplication). But, if one need to perform many divide operations with a fixed divisor, it is possible to do some preconditioning on the divisor and replace "/" with multiplication and bit operations (Chapter 10 in Hacker's Delight).

As I've tested, if the divisor is a compilation-time constant, (e.g. static final long DIVISOR = 12345L;) JVM will do the trick and replace all divisions by DIVISOR with multiplication and bit operations. I'm interesting in the same kind of trick but when the divisor is known only at runtime.

For example, the following (slow) method:

void reduceArraySlow(long[] data, long denominator){
    for(int i = 0; i < data.length; ++i)
        data[i] = data[i] / denominator;
}

can be replaced with something:

void reduceArrayFast(long[] data, long denominator){
    SomeMagicStructure magic = computeMagic(denominator);
    for(int i = 0; i < data.length; ++i)
         // computes data[i] / denominator
        data[i] = doFastDivision(data[i], magic);
}

which must do the job much faster, since all / operations are replaced with faster operations (and also because division is not pipelined in CPUs).


Solution

  • There is well known C/C++ libdivide library for fast integer division, and there is my adaptation of this library for Java libdivide4j.

    Fast division with libdivide4j looks as follows:

    void reduceArrayFast(long[] data, long denominator){
        FastDivision.Magic magic = FastDivision.magicSigned(denominator);
        for(int i = 0; i < data.length; ++i)
            // computes data[i] / denominator
            data[i] = FastDivision.divideSignedFast(data[i], magic);
    }
    

    A simple benchmark

    public void benchmark() throws Exception {
        Random rnd = new Random();
        int nIterations = 10000;
        //let the JIT to optimize something
        for (int att = 0; att < nIterations; att++) {
            long[] data = new long[1000];
            for (int i = 0; i < data.length; i++)
                data[i] = rnd.nextLong();
    
            long denominator = rnd.nextLong();
    
            long[] slow = data.clone();
            long start = System.nanoTime();
            reduceArraySlow(slow, denominator);
            long slowTime = System.nanoTime() - start;
    
    
            long[] fast = data.clone();
            start = System.nanoTime();
            reduceArrayFast(fast, denominator);
            long fastTime = System.nanoTime() - start;
    
            Assert.assertArrayEquals(slow, fast);
    
            // print last 100 timings (JVM already warmed up)
            if (att > nIterations - 100) {
                System.out.println("\"/\" operation: " + slowTime);
                System.out.println("Fast division: " + fastTime);
                System.out.println("");
            }
        }
    }
    

    shows the following timings (nanoseconds) for ordinary / and fast division (Core i7, jdk8 64-bit) :

    "/" operation: 13233
    Fast division: 5957
    
    "/" operation: 13148
    Fast division: 5103
    
    "/" operation: 13587
    Fast division: 6188
    
    "/" operation: 14173
    Fast division: 6773
    ...