I'm trying to convert a ushort[] array of 4000 * 4000 values in the range 0 to 65,000 into an int[] array holding the same values.
It seems to me that Buffer.BlockCopy is the fastest way to do that. I'm trying the following code:
ushort[] uPixels = MakeRandomShort(0, 65000, 4000 * 4000);// creates ushort[] array
int[] iPixels = new int[4000 * 4000];
int size = sizeof(ushort);
int length = uPixels.Length * size;
System.Buffer.BlockCopy(uPixels, 0, iPixels, 0, length);
But iPixels ends up holding strange values in a very strange range, such as +-1411814783, +-2078052064, etc.
What is wrong, and what do I need to do to make it work properly?
thanks!
There is a related discussion on GitHub.
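Copying a ushort[] to an int[] array does not work with a routine tuned for contiguous memory ranges. Buffer.BlockCopy copies raw bytes without converting element types, so every 4-byte int in the target receives two adjacent 2-byte ushort values, and only the first half of the target is written at all.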
Basically, you have to clear the upper halves of the target int cells. Then, some sort of (parallelized?) loop is needed to copy the actual data.
It could be possible to use unsafe code with pointers advanced in steps of two bytes. The implementation of Buffer.BlockCopy is not visible in the Microsoft source repository. It might make sense to hunt for the source and modify it.
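To make the cause of the strange values concrete, here is a minimal C++ sketch that uses memcpy as a stand-in for Buffer.BlockCopy, on the assumption that what matters is only that both copy bytes without converting elements. The function names RawByteCopy and WideningCopy are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Raw byte copy, analogous to what the question does with Buffer.BlockCopy:
// the bytes of the 16-bit source are placed back to back into the 32-bit target.
std::vector<std::int32_t> RawByteCopy(const std::vector<std::uint16_t>& src)
{
    std::vector<std::int32_t> dst(src.size(), 0);
    // Copies src.size() * 2 bytes, which fills only the first half of dst.
    std::memcpy(dst.data(), src.data(), src.size() * sizeof(std::uint16_t));
    return dst;
}

// Element-wise copy: each 16-bit value is widened to its own 32-bit element.
std::vector<std::int32_t> WideningCopy(const std::vector<std::uint16_t>& src)
{
    return std::vector<std::int32_t>(src.begin(), src.end());
}
```

With the input {1, 2, 3, 4} on a little-endian machine, RawByteCopy yields {131073, 262147, 0, 0}, because 1 and 2 share the first int and 3 and 4 share the second, while WideningCopy yields {1, 2, 3, 4}.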
Update
I implemented two C++ functions and did a rough measurement of the resulting performance compared to the C# loop copy.
C# implementation
const int LEN = 4000 * 4000;
for (int i = 0; i < LEN; i++)
{
iPixels[i] = uPixels[i];
}
C++ implementation SpeedCopy1
// Copy loop with casting from unsigned short to int
__declspec(dllexport) void SpeedCopy1(unsigned short *uArray, int * iArray, int len)
{
for (int i = 0; i < len; i++)
{
*iArray++ = *uArray++;
}
}
C++ implementation SpeedCopy2
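/// Copy loop with unsigned shorts
/// Clear upper half of int array elements in advance (memset requires <string.h>)
/// Assumes a little-endian layout, where the low 16 bits of each int come first in memory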
__declspec(dllexport) void SpeedCopy2(unsigned short* uArray, int* iArray, int len)
{
unsigned short* up = (unsigned short*)iArray;
memset(iArray, 0, sizeof(int) * len);
for (int i = 0; i < len; i++)
{
*up = *uArray++;
up += 2;
}
}
Resulting times:
C# loop copy 27 ms
SpeedCopy1 9 ms
SpeedCopy2 18 ms
Compared to the C# loop, the external C++ function SpeedCopy1 reduces the copy time to about a third (9 ms instead of 27 ms).
It remains to be seen what effect multi-threading could have.
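One way to try this would be to split the index range into chunks and run the SpeedCopy1 loop on each chunk in its own thread. The following is only a sketch under that assumption: the name SpeedCopyParallel and the numThreads parameter are hypothetical, and it has not been timed. Since the copy is probably limited by memory bandwidth, the gain may well be smaller than the number of threads suggests.

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical helper: widens len unsigned shorts into ints,
// dividing the work between up to numThreads worker threads.
void SpeedCopyParallel(const unsigned short* uArray, int* iArray, int len, int numThreads)
{
    if (len <= 0) return;

    // Use at least one thread and never more threads than elements.
    numThreads = std::max(1, std::min(numThreads, len));
    const int chunk = (len + numThreads - 1) / numThreads; // ceiling division

    std::vector<std::thread> workers;
    for (int begin = 0; begin < len; begin += chunk)
    {
        const int end = std::min(len, begin + chunk);
        // Each worker writes a disjoint range, so no locking is needed.
        workers.emplace_back([=] {
            for (int i = begin; i < end; i++)
            {
                iArray[i] = uArray[i];
            }
        });
    }

    // Wait until every chunk has been copied.
    for (std::thread& w : workers)
    {
        w.join();
    }
}
```

To call it from C#, it would need the same __declspec(dllexport) export as the two functions above. A thread count close to the number of physical cores is a reasonable starting point for measurements.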