Search code examples
.netshiftint64

Combine Two Int32 Into An Int64


Have Dictionary <Int64, byte> that gets used a lot. I mean in a loop that runs for days in a big data load. The Int64 comess from two Int32. The byte happens to be the distance (count) between those two Int32 from many very long lists.

What I need to do in this loop is

  • Generate the key
  • If key does not exists in the Dictionary then insert key and value
  • If key does exists and new value (byte) is less than the existing value then replace the existing value with the new value

Right now I am using straight math to generate the key and I know there is faster way but I cannot figure it out. I put shift as a tag as I think that is how to optimize it but I cannot figure it out.

Then when the loop is complete I need to extract the two Int32 from the Int64 to insert the data into a database.

Thanks

Per comment the math I use to combine two Int32 into one Int64

        Int64 BigInt;
        Debug.WriteLine(Int32.MaxValue);
        Int32 IntA = 0;
        Int32 IntB = 1;
        BigInt = ((Int64)IntA * Int32.MaxValue) + IntB;
        Debug.WriteLine(BigInt.ToString());
        IntA = 1;
        IntB = 0;
        BigInt = ((Int64)IntA * Int32.MaxValue) + IntB;
        Debug.WriteLine(BigInt.ToString());
        IntA = 1;
        IntB = 1;
        BigInt = ((Int64)IntA * Int32.MaxValue) + IntB;
        Debug.WriteLine(BigInt.ToString());

And the best key may not be an Int64. What I have is two Int32 that together form a key. And a value of byte. I need fast lookup on that composite key. Dictionary is fast but it does not support composite key so I create a single key that is actually a composite key. In SQL Int32A, Int32B form the PK.

The reason I don't use a composite key is I want the lookup speed of Dictionary and to my knowledge Dictionary does not support composite key. This is production code. In the SQL table there is actually a third key (Int32 sID, Int32 IntA, Int32 IntB). In this parser I am only dealing with one sID at a time (and sIDs are processed in order). I started with composite key lookup to SQL (billions in a run). When I pulled IntA, IntB out to Dictionary to process a single sID then load to SQL at the completion of each sID I got a 100:1 performance improvement. Part of the performance improvement is insert as when I insert from the Dictionary I can insert in PK order. The new IntA and IntB are not produced sorted by the parse so direct insert into SQL would severely fragment the index and I would need to rebuild the index at the end of a run.


Solution

  • Sounds like you just want a shift. Personally I find it simpler to think about bitshifting when using unsigned types instead of signed ones:

    // Note: if you're in a checked context by default, you'll want to make this
    // explicitly unchecked
    uint u1 = (uint) int1;
    uint u2 = (uint) int2;
    
    ulong unsignedKey = (((ulong) u1) << 32) | u2;
    long key = (long) unsignedKey;
    

    And to reverse:

    ulong unsignedKey = (long) key;
    uint lowBits = (uint) (unsignedKey & 0xffffffffUL);
    uint highBits = (uint) (unsignedKey >> 32);
    int i1 = (int) highBits;
    int i2 = (int) lowBits;
    

    It's entirely possible that you don't need all these conversions to unsigned types. It's more for my sanity than anything else :)

    Note that you need to cast u1 to a ulong so that the shifting works in the right space - shifting a uint by 32 bits would do nothing.

    Note that that's a way of combining two 32-integers to get a 64-bit integer. It's not the only way by any means.

    (Side-note: Bas's solution works perfectly well - I'm just always somewhat uncomfortable with that sort of approach, for no specific reason.)