java, k-means, euclidean-distance, microbenchmark, jmh

Why does increasing the number of decimal places not affect the computation time of Euclidean distance?


I am trying to microbenchmark a KMeans program. I am focussing on the Euclidean distance at the moment. I thought that (due to the square root, below) increasing the number of decimal places of each coordinate (x, y) would cause computation time to increase.

Here is how I calculate the Euclidean distance:

Math.sqrt((x - otherPoint.x) * (x - otherPoint.x) + (y - otherPoint.y) * (y - otherPoint.y))

Here are my results from microbenchmarking:

Benchmark                 (noOfFloatingPoints)  (noOfPoints)  Mode  Cnt       Score         Error  Units
TheBenchmark.toyBenchmark                    16          5000  avgt    5  251214.457 ±   40224.490  ns/op
TheBenchmark.toyBenchmark                     8          5000  avgt    5  319809.483 ±  560434.712  ns/op
TheBenchmark.toyBenchmark                     2          5000  avgt    5  477652.450 ± 1068570.972  ns/op

As you can see, the score actually increases as the number of decimal places decreases! I have tried this on 5000 points, but the pattern stays the same no matter how few or how many points I use.

Why is this the case? I thought that the more decimal places each coordinate has, the more computation would be required, especially due to the square root.

To increase the number of decimal places I have created this function:

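// Generates a random value in [lowerBound, upperBound), rounded to the given number of decimal places.
public static double generateRandomToDecimalPlace(Random rnd,
                                                  int lowerBound,
                                                  int upperBound,
                                                  int decimalPlaces) {
    final double dbl = (rnd.nextDouble() * (upperBound - lowerBound)) + lowerBound;
    return roundAvoid(dbl, decimalPlaces);
}

// Rounds value to the given number of decimal places by scaling, rounding to a long, and scaling back.
public static double roundAvoid(double value, int places) {
    double scale = Math.pow(10, places);
    return Math.round(value * scale) / scale;
}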
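
To illustrate what the helper above returns, here is a minimal, self-contained sketch. The class name RoundingDemo and the sample inputs are hypothetical and chosen only for illustration; roundAvoid is copied unchanged from above. Whatever value is passed for places, the result is still an ordinary double.

```java
public class RoundingDemo {
    // Copied from the question: scale, round to a long, scale back.
    public static double roundAvoid(double value, int places) {
        double scale = Math.pow(10, places);
        return Math.round(value * scale) / scale;
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs, not taken from the benchmark.
        double twoPlaces = roundAvoid(3.14159, 2);
        double onePlace = roundAvoid(99.999, 1);

        System.out.println(twoPlaces == 3.14);   // true: same double as the literal 3.14
        System.out.println(onePlace == 100.0);   // true: 999.99 rounds up to 1000
        // Both results occupy 64 bits, exactly like an unrounded double.
        System.out.println(Double.SIZE);
    }
}
```

A double carries roughly 15 to 17 significant decimal digits, so asking for 16 decimal places on values as large as 100 requests more precision than the type can hold.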

I am randomly generating points within a certain range (-100 to 100) and with a specific number of decimal places:

    @Param({"16", "8", "2"})
    public int noOfFloatingPoints;

Solution

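  • The type double is a binary, fixed-length data type. It always uses 64 bits to represent a value, no matter how many decimal places your number has. Furthermore, since it is encoded in binary (IEEE 754: 1 sign bit, 11 exponent bits, 52 fraction bits), it does not store decimal digits at all. It is a floating-point format that uses base-2 arithmetic, so the subtraction, multiplication and square root work on the same 64-bit operands regardless of how many decimal places the value was rounded to.
  • Note also that in the results above the error margins for the 8 and 2 cases are larger than the scores themselves, so the apparent increase is not statistically meaningful.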
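
The point about fixed width can be checked directly. The following is a minimal sketch, assuming nothing beyond the standard library; the class name DoubleLayout, the helper bits and the sample values are hypothetical and not part of the original benchmark. It prints the raw bit pattern of a value with two decimal places and of one with many, and shows that a "short" decimal such as 12.34 is not stored exactly.

```java
import java.math.BigDecimal;

public class DoubleLayout {
    // Returns the IEEE 754 bit pattern of a double as a 64-character binary string.
    public static String bits(double value) {
        String raw = Long.toBinaryString(Double.doubleToLongBits(value));
        // Left-pad with zeros, since toBinaryString drops leading zero bits.
        return String.format("%64s", raw).replace(' ', '0');
    }

    public static void main(String[] args) {
        // Hypothetical sample values for illustration.
        double twoPlaces = 12.34;
        double manyPlaces = 12.3456789012345678;

        // Both patterns are exactly 64 bits wide.
        System.out.println(bits(twoPlaces) + " (" + bits(twoPlaces).length() + " bits)");
        System.out.println(bits(manyPlaces) + " (" + bits(manyPlaces).length() + " bits)");

        // The exact value stored for 12.34 is a long binary expansion, not 12.34 itself.
        System.out.println(new BigDecimal(twoPlaces));
    }
}
```

Because both operands have the same width and layout, Math.sqrt and the arithmetic around it receive the same kind of input in every benchmark configuration.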