I was investigating the calculation performance of int, float, double, and decimal, and I am puzzled by the results. I expected int to be the winner for addition operations, but the actual result is shown in the screenshot.
Below is the code I am inspecting.
using BenchmarkDotNet.Attributes;

public class PerformanceTest
{
    [Benchmark]
    public void CalcDouble()
    {
        double firstDigit = 135.543d;
        double secondDigit = 145.1234;
        double result = firstDigit + secondDigit;
    }

    [Benchmark]
    public void CalcDecimal()
    {
        decimal firstDigit = 135.543m;
        decimal secondDigit = 145.1234m;
        decimal result = firstDigit + secondDigit;
    }

    [Benchmark]
    public void Calcfloat()
    {
        float firstDigit = 135.543f;
        float secondDigit = 145.1234f;
        float result = firstDigit + secondDigit;
    }

    [Benchmark]
    public void Calcint()
    {
        int firstDigit = 135;
        int secondDigit = 145;
        int result = firstDigit + secondDigit;
    }
}
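The class is executed with the standard BenchmarkDotNet runner; a typical entry point (assumed here, since it is not part of the snippet above) looks like this:

using BenchmarkDotNet.Running;

public class Program
{
    public static void Main()
    {
        // Runs every [Benchmark] method of PerformanceTest and prints the summary table.
        BenchmarkRunner.Run<PerformanceTest>();
    }
}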
Can anyone explain to me what is going on? Thank you.
I expected int to be the winner, but the winner is float.
Both the C# compiler and the Just-In-Time (JIT) compiler are allowed to perform various optimizations on your code. The exact set of optimizations depends on the specific versions of these compilers, but there are some basic code transformations that you should expect by default.
One of the optimizations in your example is known as constant folding; it is capable of condensing
double firstDigit = 135.543d;
double secondDigit = 145.1234;
double result = firstDigit + secondDigit;
to
double result = 280.6664d;
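If you want to verify that folding is what happens here, one option (a sketch of my own, not part of the original code) is to move the operands into non-constant fields, so neither compiler can treat them as compile-time constants:

using BenchmarkDotNet.Attributes;

public class NoFoldingTest
{
    // Instance fields are not compile-time constants, so the addition cannot be folded away.
    private double firstDigit = 135.543d;
    private double secondDigit = 145.1234;

    [Benchmark]
    public double CalcDouble() => firstDigit + secondDigit; // returning the result also keeps it from being eliminated
}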
Another optimization is known as dead code elimination. Since you do not use the results of your calculations in the benchmarks, the C#/JIT compilers are able to eliminate this code completely. Therefore, effectively, you benchmark an empty method like this:
[Benchmark]
public void CalcDouble()
{
}
The only exception is CalcDecimal: since Decimal is a struct in C# (not a primitive type), its addition compiles to a call to Decimal's operator method rather than to a single hardware instruction, and the C#/JIT compilers are not smart enough to eliminate these calculations completely (for now; this may be improved in the future).
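If you want to see which benchmark bodies actually survived, one way (a sketch; the diagnoser needs a supported OS and runtime) is to attach BenchmarkDotNet's disassembly diagnoser and inspect the generated code of each benchmark:

using BenchmarkDotNet.Attributes;

[DisassemblyDiagnoser] // exports the JIT-generated assembly for every benchmark along with the results
public class PerformanceTest
{
    // ... the benchmarks from the question ...
}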
Both of these optimizations are discussed in detail in the context of .NET benchmarking in the book "Pro .NET Benchmarking" (page 65: Dead Code Elimination, page 69: Constant Folding). Both topics belong to the chapter "Common Benchmarking Pitfalls," which covers more pitfalls that can distort the results of your benchmarks.
When you pasted your summary table, you cut off the warning section below it. I reran your benchmarks, and here is the extended version of the results (this is what BenchmarkDotNet presents by default):
| Method | Mean | Error | StdDev | Median |
|------------ |----------:|----------:|----------:|----------:|
| CalcDouble | 0.0006 ns | 0.0023 ns | 0.0022 ns | 0.0000 ns |
| CalcDecimal | 3.0367 ns | 0.0527 ns | 0.0493 ns | 3.0135 ns |
| Calcfloat | 0.0026 ns | 0.0023 ns | 0.0021 ns | 0.0000 ns |
| Calcint | 0.0004 ns | 0.0010 ns | 0.0009 ns | 0.0000 ns |
// * Warnings *
ZeroMeasurement
PerformanceTest.CalcDouble: Default -> The method duration is indistinguishable from the empty method duration
PerformanceTest.Calcfloat: Default -> The method duration is indistinguishable from the empty method duration
PerformanceTest.Calcint: Default -> The method duration is indistinguishable from the empty method duration
These warnings provide valuable insight into the results. Effectively, CalcDouble, Calcfloat, and Calcint take the same amount of time as an empty method like
public void Empty() { }
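To make this comparison explicit in the summary table, one option (my own sketch, using BenchmarkDotNet's baseline support) is to add such an empty method as a baseline benchmark:

[Benchmark(Baseline = true)]
public void Empty() { } // all other benchmarks are then reported relative to this empty method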
The numbers you see in the Mean column are just random CPU noise that is below the duration of one CPU cycle. Let's say the frequency of your CPU is 5GHz. This implies that the duration of a single CPU cycle is about 0.2ns. Nothing can be performed faster than one CPU cycle (if we talk about the latency of an operation, which is what BenchmarkDotNet measures by default; if we switch to throughput measurements, we can get "faster" calculations due to various effects like instruction-level parallelism, see "Pro .NET Benchmarking", page 440). The "Mean" value for CalcDouble, Calcfloat, and Calcint is significantly less than the duration of a single CPU cycle, so it doesn't make sense to actually compare them.
BenchmarkDotNet understands that something is wrong with the Mean column. So, in addition to the warnings below the summary table, it adds a bonus Median column (which is hidden by default) to highlight the zero duration or emptiness of the discussed benchmarks.
The best way to design such a benchmark is to make it similar to the actual real-life workload you care about. The true performance of arithmetic operations is an extremely tricky thing to measure; it depends on dozens of external factors (like the instruction-level parallelism I mentioned earlier). For details, see Chapter 7 "CPU-bound benchmarks" of "Pro .NET Benchmarking"; it has 24 case studies that provide various examples. Evaluating the "pure" duration of an arithmetic operation is an interesting technical challenge, but it may not be applicable to real-life code.
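For example, a sketch of a more workload-like benchmark (my own illustration, not from the original post) sums an array of values, so the operands are not compile-time constants and the returned result keeps the work from being eliminated:

using System;
using BenchmarkDotNet.Attributes;

public class SumBenchmarks
{
    private double[] doubles;
    private int[] ints;

    [Params(1_000)] // hypothetical array size; pick something close to your real data
    public int N;

    [GlobalSetup]
    public void Setup()
    {
        var random = new Random(42);
        doubles = new double[N];
        ints = new int[N];
        for (int i = 0; i < N; i++)
        {
            doubles[i] = random.NextDouble();
            ints[i] = random.Next();
        }
    }

    [Benchmark]
    public double SumDouble()
    {
        double sum = 0;
        for (int i = 0; i < doubles.Length; i++)
            sum += doubles[i];
        return sum; // returning the result prevents dead code elimination
    }

    [Benchmark]
    public int SumInt()
    {
        int sum = 0;
        for (int i = 0; i < ints.Length; i++)
            sum += ints[i];
        return sum;
    }
}

Even this measures "addition plus a loop and memory reads" rather than a single instruction, which is exactly why matching the benchmark to your real workload matters more than trying to isolate one operation.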
Here are also a few recommended BenchmarkDotNet tricks to design better benchmarks:

- Return the calculated values from the [Benchmark] method; this is the way to ask BenchmarkDotNet to prevent dead code elimination (a sketch of this trick follows below).

The approach suggested by Marc Gravell will work to some extent, but it may also have some other issues.
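A minimal sketch of the return-value trick applied to the question's code (the non-void return types are my change):

using BenchmarkDotNet.Attributes;

public class PerformanceTest
{
    [Benchmark]
    public double CalcDouble()
    {
        double firstDigit = 135.543d;
        double secondDigit = 145.1234;
        return firstDigit + secondDigit; // BenchmarkDotNet consumes the returned value, so dead code elimination cannot remove the addition
    }

    [Benchmark]
    public int Calcint()
    {
        int firstDigit = 135;
        int secondDigit = 145;
        return firstDigit + secondDigit;
    }
}

Keep in mind that this addresses only dead code elimination: the operands are still compile-time constants, so constant folding can still replace the addition with a precomputed value. Moving the operands into fields or passing them via [Arguments]/[Params] avoids that as well.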