I've the following code:
int n = 150000;
int[] myArray = Enumerable.Range(1, n).ToArray();
So I want myArray
contains an enumeration like 1,2,3,4,etc...
Of course the size of the array should be variable.
The thing is this is called millions of times inside a Parallel.For
so I'm looking for a way to improve it as fast as it possible. Every iteration, n
is different.
I just nugged the CommunityToolkit.HighPerformance
in order to use some advantages from there, I'm wonder if I can use Span<T>
to replace the above code, since I read this code:
var array = new byte[100];
var span = new Span<byte>(array);
span.Fill(255);
So I tried to do this:
var myArray = new int[n];
var span = new Span<int>(myArray );
span.Fill(/*nothing works here */);
So how can I populate that array with a serie of 1 to n?
I will accept another way instead using Fill
or even Span<T>
. The objective is made faster the whole process.
Here is a vectorized implementation of a FillIncremental
method. The Vector<T>
is a small container of values that a single CPU core can process in parallel. In my PC the Vector.IsHardwareAccelerated
is true
and the Vector<int>.Count
is 8
. Initially a Vector<int>
is filled with the values from 1 to 8. Then in each step all these values are incremented by 8 with a single +=
operation, and the incremented vector is copied to the next section of the target array. Finally the last few slots in the array that have not been filled by the vector (because the array Length
might not be divisible by 8), are filled with a simple for
loop:
/// <summary>
/// Fills an array of integers with incremented values, starting from the 'startValue'.
/// </summary>
public static void FillIncremental(int[] array, int startValue = 0)
{
ArgumentNullException.ThrowIfNull(array);
if (array.Length > 0 && startValue > (Int32.MaxValue - array.Length) + 1)
throw new ArgumentOutOfRangeException(nameof(startValue));
static void FillSimple(int[] array, int index, int length, int valueOffset)
{
int endIndex = index + length;
for (int i = index, j = index + valueOffset; i < endIndex; i++, j++)
array[i] = j;
}
if (!Vector.IsHardwareAccelerated || array.Length < Vector<int>.Count)
{
FillSimple(array, 0, array.Length, startValue);
return;
}
FillSimple(array, 0, Vector<int>.Count, startValue);
Vector<int> vector = new(array);
Vector<int> step = new(Vector<int>.Count);
int endIndex = array.Length - Vector<int>.Count + 1;
int i;
for (i = Vector<int>.Count; i < endIndex; i += Vector<int>.Count)
{
vector += step;
vector.CopyTo(array, i);
}
FillSimple(array, i, array.Length - i, startValue);
}
Usage example:
int n = 150_000;
int[] myArray = new int[n];
FillIncremental(myArray, 1);
In my PC the FillIncremental
method is about 4 times faster than filling the array with a simple for
loop (online benchmark).
I am not overly familiar with vectors, so it might be possible to optimize further the above approach.
Update: Enigmativity mentioned in the comments that the simple int[] myArray = Enumerable.Range(1, n).ToArray()
is actually faster than the above vectorized implementation. My own benchmark confirms this observation. Currently I have no idea why the Enumerable.Range
is so fast. According to the source code is should perform similarly to the FillSimple
above, so it should be around 3 times slower (taking into account the constant time of instantiating the array). It's around 15% faster instead (on .NET 7, .NET 6 and .NET 5). 🤔