Basically I'm trying to build a unique 64bit id from coordinates that I can then pull apart later on. These operations will be performed billions of times in short order, so speed is critical. Here is what I am after. I have 4 - 32 bit integers but only the bottom 16 bits are relevant. I want to concatenate the bottom 16 bits into one 64 bit 'long' (doesn't matter if it's signed or not, since the bits are the same). So if I have:
largeId = 0000 0000 0000 0000 0000 0000 1000 1000
x = 0000 0000 0000 0000 0000 0000 1100 1100
y = 0000 0000 0000 0000 0000 0000 1110 1110
z = 0000 0000 0000 0000 0000 0000 1111 1111
it will become:
Id = 0000 0000 1000 1000 0000 0000 1100 1100 0000 0000 1110 1110 0000 0000 1111 1111
I've written a few routines that produce the desired results (i.e. builds and pulls apart) and have timed them using 500^3 iterations to try to find the fastest routine. The routine that decodes the 64 bit number back into 4 int variables runs at about 43% of the time required to encode them. How can I speed up the encoding??
Routines: (updated with variations of Paul Smiths suggestions below)
public static long GetCombinedId(int largeId, int x, int y, int z)
{
var _largeId = (long)largeId;
var _x = (long)x;
var _y = (long)y;
var _z = (long)z;
return (_largeId << 48) | (_x << 32) | (_y << 16) | _z;
}
public static long GetCombinedId2(int largeId, int x, int y, int z)
{
return ((long)largeId << 48) | ((long)x << 32) | ((long)y << 16) | (long)z;
}
public static long GetCombinedId3(int largeId, int x, int y, int z)
{
unchecked
{
return ((long)(largeId << 16 | x) << 32) | (y << 16 | z );
}
}
public static void GetCoordinates(long id, out int largeId, out int x, out int y, out int z)
{
largeId = (int)(id >> 48);
x = (int)((id >> 32) & 0x0000_0000_0000_FFFF);
y = (int)((id >> 16) & 0x0000_0000_0000_FFFF);
z = (int)(id & 0x0000_0000_0000_FFFF);
}
public static void GetCoordinates2(long id, out int largeId, out int x, out int y, out int z)
{
largeId = (int)(id >> 48);
x = (int)((id << 16 ) >> 48);
y = (int)((id << 32 ) >> 48);
z = (int)((id << 48 ) >> 48);
}
Variations of Paul Smith's technique described in answers section
[StructLayout(LayoutKind.Explicit)]
public struct Mapper
{
[FieldOffset(0)] public Int64 Combined;
[FieldOffset(0)] public Int16 Short0;
[FieldOffset(2)] public Int16 Short1;
[FieldOffset(4)] public Int16 Short2;
[FieldOffset(6)] public Int16 Short3;
}
public static long GetId4(int largeId, int x, int y, int z)
{
Mapper mapper = new Mapper()
{
Short0 = (Int16)z,
Short1 = (Int16)y,
Short2 = (Int16)x,
Short3 = (Int16)largeId
};
return mapper.Combined;
}
private static Mapper _mapper = new Mapper();
public static long GetId5(int largeId, int x, int y, int z)
{
_mapper.Short0 = (Int16)z;
_mapper.Short1 = (Int16)y;
_mapper.Short2 = (Int16)x;
_mapper.Short3 = (Int16)largeId;
return _mapper.Combined;
}
[StructLayout(LayoutKind.Explicit)]
public struct Mapper2
{
[FieldOffset(0)] public Int64 Combined;
[FieldOffset(0)] public Int32 Integer0;
[FieldOffset(4)] public Int32 Integer1;
}
private static Mapper2 _mapper2 = new Mapper2();
public static long GetId6(int largeId, int x, int y, int z)
{
_mapper2.Integer0 = y << 16 | z; //dangerous because we aren't checking upper bits of z
_mapper2.Integer1 = largeId << 16 | x; //dangerous because we aren't checking upper bits of x
return _mapper2.Combined;
}
Results:
GetId1 = 2168ms
GetId2 = 1824ms
GetId3 = 1679ms
GetId4 = 2217ms
GetId5 = 2008ms
GetId6 = 1757ms
GetCoord1 = 785ms
GetCoord2 = 865ms
Routine1: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine2: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine3: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine4: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine5: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine6: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
255, 170, 187, 204
255, 170, 187, 204
Is there a better/faster way to encode the 4 integers into a 64 bit long?
(fyi...the BitConverter class is paralyzingly slow and was removed because it isn't feasible)
5 days later and a question popped into my head while taking a shower...What if the time is being eaten up by moving the return value off the local stack into the calling procedure stack? Turns out...it was.
The new method below takes the fastest method from above (method #3) and instead of returning the variable (which causes the "stack to stack" copy), I pass in a the return value as an "out" reference. This allows the calculation to be performed directly into the resultant variable from the calling procedure.
Doing this...I can now encode faster than I can decode, which was the goal all along. Below is the new routine and speed comparisons.
public static void GetId7(int largeId, int x, int y, int z, out long id)
{
id = ((long)(largeId << 16 | x) << 32) | (y << 16 | z);
}
Speed comparisons. GetId7 shows the new results:
GetId1 = 2282ms
GetId2 = 1910ms
GetId3 = 1782ms
GetId4 = 2306ms
GetId5 = 2092ms
GetId6 = 1816ms
GetId7 = 831ms
GetCoord1 = 828ms
GetCoord2 = 930ms
Routine1: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine2: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine3: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine4: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine5: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine6: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
Routine7: 71776849217913036 binary: 11111111000000001010101000000000101110110000000011001100
255, 170, 187, 204
255, 170, 187, 204
Not that it's needed, but I am curious if anyone can get it even faster.