Tags: c#, .net, benchmarkdotnet

BenchmarkDotNet gives unexpected results


I was investigating the performance of arithmetic on int, float, double, and decimal, and the results puzzle me. I expected that for addition the winner would be int, but the truth is in the screenshot below.

[Screenshot: benchmark results]

Below is the code I am inspecting.


using BenchmarkDotNet.Attributes;

public class PerformanceTest
{
    [Benchmark]
    public void CalcDouble()
    {
        double firstDigit = 135.543d;
        double secondDigit = 145.1234;

        double result = firstDigit + secondDigit;

    }

    [Benchmark]
    public void CalcDecimal()
    {
        decimal firstDigit = 135.543m;
        decimal secondDigit = 145.1234m;

        decimal result = firstDigit + secondDigit;
    }

    [Benchmark]
    public void Calcfloat()
    {
        float firstDigit = 135.543f;
        float secondDigit = 145.1234f;

        float result = firstDigit + secondDigit;
    }

    [Benchmark]
    public void Calcint()
    {
        int firstDigit = 135;
        int secondDigit = 145;

        int result = firstDigit + secondDigit;
    }
}


Can anyone explain to me what is going on? Thank you.

I expected int to be the winner, but the winner is float.


Solution

    Part 1. The problem with the benchmarks

    Both the C# compiler and the Just-In-Time (JIT) compiler are allowed to perform various optimizations on your code. The exact set of optimizations depends on the specific versions of these compilers, but there are some basic code transformations that you should expect by default.

    One of the optimizations in your example is known as constant folding; it is capable of condensing

    double firstDigit = 135.543d;
    double secondDigit = 145.1234;
    double result = firstDigit + secondDigit;
    

    to

    double result = 280.6664d;
    

    Another optimization is known as dead code elimination. Since you do not use the results of your calculations in the benchmarks, the C#/JIT compilers are able to eliminate this code completely. Therefore, effectively, you benchmark an empty method like this:

    [Benchmark]
    public void CalcDouble()
    {
    }
    

    The only exception is CalcDecimal: since decimal is a struct in C# (not a primitive type), the C#/JIT compilers are not smart enough to completely eliminate the calculations (for now; this may improve in the future).
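
    For context: unlike double or int addition, which map to single hardware instructions, the decimal + operator compiles to a method call on System.Decimal, roughly equivalent to the sketch below, which is why the JIT does not fold or eliminate it:

    decimal firstDigit = 135.543m;
    decimal secondDigit = 145.1234m;

    // decimal '+' is not a hardware instruction; the C# compiler lowers it
    // to a call into System.Decimal (equivalent to decimal.Add), and
    // current JIT compilers do not constant-fold that call away.
    decimal result = decimal.Add(firstDigit, secondDigit);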

    Both of these optimizations are discussed in detail in the context of .NET benchmarking in the book "Pro .NET Benchmarking" (page 65: Dead Code Elimination; page 69: Constant Folding). Both topics belong to the chapter "Common Benchmarking Pitfalls," which covers more pitfalls that can distort the results of your benchmarks.

    Part 2. BenchmarkDotNet Results

    When you pasted your summary table, you cut off the warning section below the table. I reran your benchmarks; here is the extended version of the results (as presented by BenchmarkDotNet by default):

    |      Method |      Mean |     Error |    StdDev |    Median |
    |------------ |----------:|----------:|----------:|----------:|
    |  CalcDouble | 0.0006 ns | 0.0023 ns | 0.0022 ns | 0.0000 ns |
    | CalcDecimal | 3.0367 ns | 0.0527 ns | 0.0493 ns | 3.0135 ns |
    |   Calcfloat | 0.0026 ns | 0.0023 ns | 0.0021 ns | 0.0000 ns |
    |     Calcint | 0.0004 ns | 0.0010 ns | 0.0009 ns | 0.0000 ns |
    
    // * Warnings *
    ZeroMeasurement
      PerformanceTest.CalcDouble: Default -> The method duration is indistinguishable from the empty method duration
      PerformanceTest.Calcfloat: Default  -> The method duration is indistinguishable from the empty method duration
      PerformanceTest.Calcint: Default    -> The method duration is indistinguishable from the empty method duration
    

    These warnings provide valuable insight into the results. Effectively, CalcDouble, Calcfloat, and Calcint take the same amount of time as an empty method like

    public void Empty() { }
    

    The numbers you see in the Mean column are just random CPU noise below the duration of one CPU cycle. Let's say the frequency of your CPU is 5 GHz; that implies the duration of a single CPU cycle is about 0.2 ns. Nothing can be performed faster than one CPU cycle, at least when we talk about the latency of an operation, which is what BenchmarkDotNet measures by default. (If we switch to throughput measurements, we can get "faster" results thanks to various effects like instruction-level parallelism; see "Pro .NET Benchmarking", page 440.) The Mean values for CalcDouble, Calcfloat, and Calcint are significantly less than the duration of a single CPU cycle, so it doesn't make sense to compare them.

    BenchmarkDotNet understands that something is wrong with the Mean column. So, in addition to the warnings below the summary table, it adds a bonus Median column (hidden by default) to highlight the zero duration, i.e., the emptiness, of the discussed benchmarks.

    Part 3. Possible benchmark design improvements

    The best way to design such a benchmark is to make it similar to the actual real-life workload you care about. The true performance of arithmetic operations is extremely tricky to measure; it depends on dozens of external factors (like the instruction-level parallelism mentioned earlier). For details, see Chapter 7, "CPU-bound benchmarks," of "Pro .NET Benchmarking"; it has 24 case studies that provide various examples. Evaluating the "pure" duration of an arithmetic operation is an interesting technical challenge, but the result is rarely applicable to real-life code.

    Here are also a few recommended BenchmarkDotNet tricks to design better benchmarks:

    1. Move all the "constant" variables to public fields/properties. In this case, the C#/JIT compilers will not be able to apply constant folding (because they cannot know in advance that nobody is going to change the values of these public fields/properties).
    2. Return the result of your calculations from the [Benchmark] method. This is the way to ask BenchmarkDotNet to prevent dead code elimination. A sketch combining both tricks follows this list.
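
    Combining both tricks, a corrected version of the double benchmark could look like the following sketch (the field values mirror the original constants; this is one possible shape, not the only one):

    using BenchmarkDotNet.Attributes;

    public class PerformanceTest
    {
        // Public mutable fields prevent constant folding:
        // the JIT cannot assume these values never change.
        public double FirstDouble = 135.543d;
        public double SecondDouble = 145.1234d;

        [Benchmark]
        public double CalcDouble()
        {
            // Returning the result prevents dead code elimination:
            // BenchmarkDotNet consumes the returned value.
            return FirstDouble + SecondDouble;
        }
    }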

    The loop-based approach suggested by Marc Gravell will work to some extent. However, it may also have some other issues (see the sketch after this list):

    1. Using a constant number of iterations in the for loop is not recommended, since different JIT compilers may apply loop unrolling differently (another benchmarking pitfall; see "Pro .NET Benchmarking", page 61), which may distort the results in some environments.
    2. Be aware that adding an artificial loop adds some performance cost to your benchmarks. So, it's OK to use such a set of benchmarks to get relative results, but the absolute numbers will also include the loop overhead ("Pro .NET Benchmarking", page 54).
    3. To the best of my knowledge, modern C#/JIT compilers are not smart enough to completely eliminate such code. However, there is no guarantee that it will never be eliminated, since the method effectively returns the same constant every time. Future compiler versions may be smart enough to perform such optimizations (I believe some Java runtimes are capable of eliminating similar benchmarks). So, it's better to move all the constants to public non-constant fields/properties in order to prevent such situations.
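
    For context, the loop-based approach under discussion looks roughly like the following (a hedged reconstruction, since the referenced answer is not reproduced here; the iteration count of 1000 is an illustrative choice):

    using BenchmarkDotNet.Attributes;

    public class LoopedPerformanceTest
    {
        [Benchmark]
        public double CalcDouble()
        {
            double result = 0;
            // A fixed iteration count is subject to pitfalls 1-3 above:
            // loop unrolling, loop overhead, and possible elimination,
            // since the method always returns the same constant.
            for (int i = 0; i < 1000; i++)
            {
                result += 135.543d + 145.1234d;
            }
            return result;
        }
    }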