I am trying to microbenchmark a KMeans program, focusing on the Euclidean distance calculation for the moment. I assumed that (because of the square root, below) increasing the number of decimal places in each coordinate (x, y) would increase computation time.
Here is how I calculate the Euclidean distance:
Math.sqrt((x - otherPoint.x) * (x - otherPoint.x) + (y - otherPoint.y) * (y - otherPoint.y))
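For context, that expression sits inside a small Point class, roughly like this (a simplified sketch rather than my exact class):

public class Point {
    final double x;
    final double y;

    public Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Straight-line (Euclidean) distance from this point to another.
    public double distanceTo(Point otherPoint) {
        return Math.sqrt((x - otherPoint.x) * (x - otherPoint.x)
                + (y - otherPoint.y) * (y - otherPoint.y));
    }
}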
Here are my results from microbenchmarking:
Benchmark                  (noOfFloatingPoints)  (noOfPoints)  Mode  Cnt       Score         Error  Units
TheBenchmark.toyBenchmark                    16          5000  avgt    5  251214.457 ±   40224.490  ns/op
TheBenchmark.toyBenchmark                     8          5000  avgt    5  319809.483 ±  560434.712  ns/op
TheBenchmark.toyBenchmark                     2          5000  avgt    5  477652.450 ± 1068570.972  ns/op
As you can see, the score actually increases as the number of decimal places decreases! I ran this on 5000 points, but the pattern stays the same no matter how few or how many points I use.

Why is this the case? I thought that more decimal places would mean more computation, especially because of the square root.
To generate values with a specific number of decimal places, I created these functions:
public static double generateRandomToDecimalPlace(Random rnd,
                                                  int lowerBound,
                                                  int upperBound,
                                                  int decimalPlaces) {
    // Uniform value in [lowerBound, upperBound), then rounded.
    final double dbl = (rnd.nextDouble() * (upperBound - lowerBound)) + lowerBound;
    return roundAvoid(dbl, decimalPlaces);
}

public static double roundAvoid(double value, int places) {
    // Scale up, round to the nearest long, scale back down.
    double scale = Math.pow(10, places);
    return Math.round(value * scale) / scale;
}
I am randomly generating points within a fixed range (-100 to 100) and with a specific number of decimal places:
@Param({"16", "8", "2"})
public int noOfFloatingPoints;
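Putting it together, the JMH harness is wired up roughly like this (a simplified sketch: the array fields, the fixed seed, and the loop body are illustrative, not my exact benchmark code):

import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class TheBenchmark {

    @Param({"16", "8", "2"})
    public int noOfFloatingPoints;

    @Param({"5000"})
    public int noOfPoints;

    private double[] xs;
    private double[] ys;

    @Setup(Level.Trial)
    public void setup() {
        Random rnd = new Random(42); // fixed seed so every fork sees the same data
        xs = new double[noOfPoints];
        ys = new double[noOfPoints];
        for (int i = 0; i < noOfPoints; i++) {
            xs[i] = generateRandomToDecimalPlace(rnd, -100, 100, noOfFloatingPoints);
            ys[i] = generateRandomToDecimalPlace(rnd, -100, 100, noOfFloatingPoints);
        }
    }

    @Benchmark
    public void toyBenchmark(Blackhole bh) {
        // Distance from every point to the first one; the Blackhole keeps
        // the JIT from eliminating the computation as dead code.
        for (int i = 0; i < noOfPoints; i++) {
            bh.consume(Math.sqrt((xs[i] - xs[0]) * (xs[i] - xs[0])
                    + (ys[i] - ys[0]) * (ys[i] - ys[0])));
        }
    }

    // The helper methods shown earlier, repeated so the sketch compiles.
    public static double generateRandomToDecimalPlace(Random rnd, int lowerBound,
                                                      int upperBound, int decimalPlaces) {
        final double dbl = (rnd.nextDouble() * (upperBound - lowerBound)) + lowerBound;
        return roundAvoid(dbl, decimalPlaces);
    }

    public static double roundAvoid(double value, int places) {
        double scale = Math.pow(10, places);
        return Math.round(value * scale) / scale;
    }
}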
The type double is a binary, fixed-width data type. It always uses 64 bits to represent a value, no matter how many decimal places your number has. Furthermore, since it is encoded in binary (IEEE 754), it does not store decimal digits at all: it uses a base-2 floating-point representation made up of a sign bit, an exponent, and a mantissa.
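You can see this directly: every double has exactly the same 64-bit layout, no matter how many decimal digits its literal has. Here is a small demonstration (the specific literals are arbitrary):

import java.math.BigDecimal;

public class DoubleLayoutDemo {
    public static void main(String[] args) {
        double twoPlaces = 33.33;                 // "2 decimal places"
        double manyPlaces = 33.3333333333333333;  // "16 decimal places"

        // Both bit patterns are exactly 64 bits wide; there is no
        // "smaller" double for numbers with fewer decimal places.
        System.out.printf("%016x%n", Double.doubleToLongBits(twoPlaces));
        System.out.printf("%016x%n", Double.doubleToLongBits(manyPlaces));

        // The stored value is not decimal at all: BigDecimal exposes the
        // exact base-2 value behind the literal, which is merely close to 33.33.
        System.out.println(new BigDecimal(twoPlaces));
    }
}

Rounding a coordinate to 2 decimal places therefore does not make the multiplications or Math.sqrt any cheaper: the hardware performs the same 64-bit operations either way.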