Why are the INumber<T>.Create* methods so slow compared to implicit conversion for float and double?

I'm working on a maths library, and I'd like to convert it to use the new INumber<T> interfaces in System.Numerics. The methods in it are often on hot paths, so they should be as fast as possible while also being pleasant to work with. I've been benchmarking as I go, and noticed something that seems a little odd.

There are a few cases where we divide an array value by its index. In generic code constrained to INumber<T>, this cannot be done via an implicit conversion from int, as it can when the concrete type is known. The recommended approach seems to be INumber<T>.Create{Checked|Saturating|Truncating}(int n).
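
For readers who haven't used the three variants: they differ only in how they treat a value that doesn't fit the target type. Below is a minimal sketch, assuming .NET 7 or later (where the generic math interfaces are available). The CreateVariants class and its method names are hypothetical wrappers made up for illustration; they are not part of System.Numerics.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class, for illustration only.
public static class CreateVariants
{
    // Throws OverflowException when n cannot be represented in T.
    public static T Checked<T>(int n) where T : INumber<T> => T.CreateChecked(n);

    // Clamps n to the range of T (for example 300 becomes 255 for byte).
    public static T Saturating<T>(int n) where T : INumber<T> => T.CreateSaturating(n);

    // Keeps only the low-order bits that fit in T (for example 300 becomes 44 for byte).
    public static T Truncating<T>(int n) where T : INumber<T> => T.CreateTruncating(n);
}
```

Every int is exactly representable as a double, so for an index converted to double all three variants return the same value. The differences only show up for narrower targets such as byte.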

The benchmark methods I'm running are fairly simple, like so:

public static double[] DivideByImplicit(double[] values)
{
    double[] result = new double[values.Length];
    for (int i = 1; i < result.Length; i++)
    {
        result[i] = values[i] / i;
    }

    return result;
}

// Overloads for other types (int, long, float, decimal)
// ...


public static T[] DivideByChecked<T>(T[] values) where T : INumber<T>
{
    T[] result = new T[values.Length];
    for (int i = 1; i < values.Length; i++)
    {
        result[i] = values[i] / T.CreateChecked(i);
    }

    return result;
}

// Same again but for Saturating/Truncating
// ...

For int, long and decimal, the generic versions are within about 4% of the implicit-conversion baseline. For float and double the hit is much larger: roughly 25-31% for double, and about a 100% increase in time (2x) for float:

Method	Categories	Count	Mean	Error	StdDev	Ratio	RatioSD
DivideByImplicitDecimal	decimal	1000	29,292.733 ns	146.0540 ns	136.6190 ns	1.00	0.00
DivideByCheckedDecimal	decimal	1000	28,884.550 ns	82.0562 ns	76.7554 ns	0.99	0.00
DivideBySaturatingDecimal	decimal	1000	29,948.023 ns	61.6380 ns	54.6404 ns	1.02	0.01
DivideByTruncatingDecimal	decimal	1000	30,367.067 ns	67.0287 ns	62.6987 ns	1.04	0.01

DivideByImplicitDouble	double	1000	1,067.421 ns	20.5178 ns	24.4250 ns	1.00	0.00
DivideByCheckedDouble	double	1000	1,342.562 ns	17.7660 ns	14.8354 ns	1.25	0.03
DivideBySaturatingDouble	double	1000	1,343.400 ns	16.4938 ns	13.7731 ns	1.25	0.04
DivideByTruncatingDouble	double	1000	1,394.057 ns	27.2936 ns	44.0740 ns	1.31	0.04

DivideByImplicitFloat	float	1000	636.576 ns	4.9312 ns	4.6127 ns	1.00	0.00
DivideByCheckedFloat	float	1000	1,282.109 ns	13.9466 ns	12.3633 ns	2.01	0.02
DivideBySaturatingFloat	float	1000	1,288.927 ns	10.4365 ns	8.7149 ns	2.03	0.02
DivideByTruncatingFloat	float	1000	1,291.335 ns	20.6446 ns	17.2392 ns	2.03	0.04

DivideByImplicitInt	int	1000	1,210.849 ns	13.0819 ns	12.2368 ns	1.00	0.00
DivideByCheckedInt	int	1000	1,202.773 ns	7.3879 ns	6.5492 ns	0.99	0.01
DivideBySaturatingInt	int	1000	1,201.430 ns	8.0413 ns	6.7149 ns	0.99	0.01
DivideByTruncatingInt	int	1000	1,199.158 ns	9.0592 ns	7.0728 ns	0.99	0.01

DivideByImplicitLong	long	1000	1,712.002 ns	19.3905 ns	17.1891 ns	1.00	0.00
DivideByCheckedLong	long	1000	1,716.886 ns	17.9844 ns	16.8226 ns	1.00	0.01
DivideBySaturatingLong	long	1000	1,712.213 ns	31.2334 ns	29.2157 ns	1.00	0.02
DivideByTruncatingLong	long	1000	1,771.120 ns	34.2546 ns	40.7777 ns	1.03	0.03

Why is the impact so much higher when converting to float or double than to any other numeric type? Not shown for brevity: I also ran the benchmarks for arrays of size 1, 100 and 1_000_000. The slowdown wasn't visible at 1 or 100, but was similar at 1_000_000.

Solution

CreateChecked can be slow when the target type T differs from the type of the value you pass in, because the int index has to be converted to T on every iteration. You can avoid that conversion entirely by keeping a second loop counter typed as T and dividing by that instead.

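public static T[] DivideByChecked<T>(T[] values) where T : INumber<T>
{
    T[] result = new T[values.Length];

    // Counter of type T that mirrors i; starts at 1, like the loop index.
    var tI = T.MultiplicativeIdentity;

    // Index 0 is skipped (it would divide by zero), so result[0] stays default(T).
    for (int i = 1; i < values.Length; i++)
    {
        result[i] = values[i] / tI;
        tI++; // advance in step with i, with no int-to-T conversion
    }

    return result;
}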
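
One caveat is worth checking before swapping this in: the T-typed counter only tracks i for as long as T can represent every consecutive integer. For float that stops at 2^24 (16,777,216), after which incrementing by one no longer changes the value. Here is a small sketch for confirming that the two versions agree on your inputs, assuming .NET 7 or later. The names DivideComparison, DivideByCounter, SameResults and FloatCounterStallsAt2Pow24 are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Numerics;

// Hypothetical comparison helpers, for illustration only.
public static class DivideComparison
{
    // Version from the question: converts the int index to T on every iteration.
    public static T[] DivideByChecked<T>(T[] values) where T : INumber<T>
    {
        T[] result = new T[values.Length];
        for (int i = 1; i < values.Length; i++)
        {
            result[i] = values[i] / T.CreateChecked(i);
        }
        return result;
    }

    // Version from the answer: keeps a counter of type T alongside the int index.
    public static T[] DivideByCounter<T>(T[] values) where T : INumber<T>
    {
        T[] result = new T[values.Length];
        T counter = T.MultiplicativeIdentity;
        for (int i = 1; i < values.Length; i++)
        {
            result[i] = values[i] / counter;
            counter++;
        }
        return result;
    }

    // Returns true when both versions produce the same value at every index.
    public static bool SameResults<T>(T[] values) where T : INumber<T>
    {
        T[] a = DivideByChecked(values);
        T[] b = DivideByCounter(values);
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    // Shows the float limit: 2^24 + 1 is not representable, so the increment is lost.
    public static bool FloatCounterStallsAt2Pow24()
    {
        float counter = 16_777_216f; // 2^24
        counter++;
        return counter == 16_777_216f;
    }
}
```

For the array sizes in the question (up to 1_000_000) the counter stays exact for float, double, int, long and decimal, so the two versions should be interchangeable there. For float arrays longer than 2^24 elements the results would diverge.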