c# multithreading performance .net-6.0 highperformance

Create and fill an array with an enumeration from 1 to n in the fastest method possible


I have the following code:

int n = 150000;
int[] myArray = Enumerable.Range(1, n).ToArray();

I want myArray to contain the sequence 1, 2, 3, 4, and so on.
The size of the array must be variable.

This code is called millions of times inside a Parallel.For, so I'm looking for a way to make it as fast as possible. The value of n is different in every iteration.

I just added the CommunityToolkit.HighPerformance NuGet package to take advantage of it, and I'm wondering whether I can use Span<T> to replace the above code, since I came across this code:

var array = new byte[100];
var span = new Span<byte>(array);

span.Fill(255);

So I tried to do this:

var myArray = new int[n];
var span = new Span<int>(myArray );
span.Fill(/*nothing works here */);

So how can I populate that array with a series from 1 to n?
I will accept an approach that doesn't use Fill, or even Span<T>. The objective is to make the whole process faster.


Solution

  • Here is a vectorized implementation of a FillIncremental method. A Vector<T> is a small container of values that a single CPU core can process in parallel. On my PC Vector.IsHardwareAccelerated is true and Vector<int>.Count is 8. Initially a Vector<int> is filled with the first 8 values of the sequence (1 to 8 when startValue is 1). Then in each step all these values are incremented by 8 with a single += operation, and the incremented vector is copied to the next section of the target array. Finally, the last few slots in the array that have not been filled by the vector (because the array Length might not be divisible by 8) are filled with a simple for loop:

    /// <summary>
    /// Fills an array of integers with incremented values, starting from the 'startValue'.
    /// </summary>
    public static void FillIncremental(int[] array, int startValue = 0)
    {
        ArgumentNullException.ThrowIfNull(array);
        if (array.Length > 0 && startValue > (Int32.MaxValue - array.Length) + 1)
            throw new ArgumentOutOfRangeException(nameof(startValue));
    
        static void FillSimple(int[] array, int index, int length, int valueOffset)
        {
            int endIndex =  index + length;
            for (int i = index, j = index + valueOffset; i < endIndex; i++, j++)
                array[i] = j;
        }
    
        if (!Vector.IsHardwareAccelerated || array.Length < Vector<int>.Count)
        {
            FillSimple(array, 0, array.Length, startValue);
            return;
        }
        FillSimple(array, 0, Vector<int>.Count, startValue);
        Vector<int> vector = new(array);
        Vector<int> step = new(Vector<int>.Count);
        int endIndex = array.Length - Vector<int>.Count + 1;
        int i;
        for (i = Vector<int>.Count; i < endIndex; i += Vector<int>.Count)
        {
            vector += step;
            vector.CopyTo(array, i);
        }
        FillSimple(array, i, array.Length - i, startValue);
    }
    

    Usage example:

    int n = 150_000;
    int[] myArray = new int[n];
    FillIncremental(myArray, 1);
    

    On my PC the FillIncremental method is about 4 times faster than filling the array with a simple for loop (online benchmark).
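
    For reference, the "simple for loop" baseline mentioned above might look like the following minimal sketch. The FillSequential name and the SequenceFill class are hypothetical and not part of the original answer. It takes a Span<int>, which also shows what to use in place of Span<T>.Fill: that method can only write one repeated value, so an incrementing sequence needs an explicit loop.

```csharp
using System;

public static class SequenceFill
{
    // Hypothetical helper (an assumption, not from the original answer):
    // writes startValue, startValue + 1, ... into consecutive slots of the span.
    public static void FillSequential(Span<int> span, int startValue = 0)
    {
        for (int i = 0; i < span.Length; i++)
            span[i] = startValue + i;
    }

    public static void Main()
    {
        int n = 10;
        int[] myArray = new int[n];
        FillSequential(myArray, 1); // int[] converts implicitly to Span<int>
        Console.WriteLine(string.Join(",", myArray)); // prints 1,2,3,4,5,6,7,8,9,10
    }
}
```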

    I am not overly familiar with vectors, so it might be possible to optimize the above approach further.


    Update: Enigmativity mentioned in the comments that the simple int[] myArray = Enumerable.Range(1, n).ToArray() is actually faster than the above vectorized implementation. My own benchmark confirms this observation. Currently I have no idea why Enumerable.Range is so fast. According to the source code it should perform similarly to the FillSimple above, so it should be around 3 times slower (taking into account the constant time of instantiating the array). Instead, it's around 15% faster (on .NET 7, .NET 6 and .NET 5). 🤔
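
    A comparison like the one described in the update can be roughly reproduced with a Stopwatch. The sketch below is only an illustration: the RoughTiming class and its CreateWithLoop and Measure methods are hypothetical names, the iteration count is arbitrary, and it is not the benchmark linked above. Timings from a loop like this are noisy and depend on the runtime version and hardware, so a dedicated benchmarking library is preferable for numbers you intend to rely on.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class RoughTiming
{
    // Hypothetical baseline: allocate the array and fill it with a plain loop.
    public static int[] CreateWithLoop(int n)
    {
        int[] array = new int[n];
        for (int i = 0; i < array.Length; i++)
            array[i] = i + 1;
        return array;
    }

    // Times 'iterations' calls of 'create' (n must be at least 1).
    public static TimeSpan Measure(Func<int, int[]> create, int n, int iterations)
    {
        create(n); // warm-up call so the first-call JIT cost is not measured
        long checksum = 0;
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int k = 0; k < iterations; k++)
            checksum += create(n)[n - 1]; // read the last element so the result is used
        stopwatch.Stop();
        // The last element of a 1..n sequence is n, so the sum must be n * iterations.
        if (checksum != (long)n * iterations)
            throw new InvalidOperationException("Unexpected array content.");
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        const int n = 150_000, iterations = 1_000;
        TimeSpan loop = Measure(CreateWithLoop, n, iterations);
        TimeSpan range = Measure(x => Enumerable.Range(1, x).ToArray(), n, iterations);
        Console.WriteLine($"Simple loop:      {loop.TotalMilliseconds:F1} ms");
        Console.WriteLine($"Enumerable.Range: {range.TotalMilliseconds:F1} ms");
    }
}
```

    The FillIncremental method from this answer can be measured the same way by passing a lambda that allocates the array, fills it, and returns it.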