javascript typescript biginteger data-conversion uint

Converting Uint8Array to BigInt in Javascript


I have found three ways to convert a Uint8Array to a BigInt, and each of them gives a different result. Could you please tell me which one is correct and which one I should use?

  1. Using the bigint-conversion library. Its bigintConversion.bufToBigint() function returns a BigInt. The implementation is as follows:
export function bufToBigint (buf: ArrayBuffer|TypedArray|Buffer): bigint {
  let bits = 8n
  if (ArrayBuffer.isView(buf)) bits = BigInt(buf.BYTES_PER_ELEMENT * 8)
  else buf = new Uint8Array(buf)

  let ret = 0n
  for (const i of (buf as TypedArray|Buffer).values()) {
    const bi = BigInt(i)
    ret = (ret << bits) + bi
  }
  return ret
}
  2. Using DataView:
let view = new DataView(arr.buffer, 0);
let result = view.getBigUint64(0, true);
  3. Using a for loop:
let result = BigInt(0);
for (let i = arr.length - 1; i >= 0; i++) {
  result = result * BigInt(256) + BigInt(arr[i]);
}

I'm honestly confused about which one is right, since all of them produce a result but no two results agree.


Solution

  • I'm fine with either BE or LE but I'd just like to know why these 3 methods give a different result.

    One reason for the different results is that they use different endianness.

    Let's turn your snippets into a form where we can execute and compare them:

    let source_array = new Uint8Array([
        0xff, 0xee, 0xdd, 0xcc, 0xbb, 0xaa, 0x99, 0x88, 
        0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11]);
    let buffer = source_array.buffer;
    
    function method1(buf) {
      let bits = 8n
      if (ArrayBuffer.isView(buf)) {
        bits = BigInt(buf.BYTES_PER_ELEMENT * 8)
      } else {
        buf = new Uint8Array(buf)
      }
    
      let ret = 0n
      for (const i of buf.values()) {
        const bi = BigInt(i)
        ret = (ret << bits) + bi
      }
      return ret
    }
    
    function method2(buf) {
      let view = new DataView(buf, 0);
      return view.getBigUint64(0, true);
    }
    
    function method3(buf) {
      let arr = new Uint8Array(buf);
      let result = BigInt(0);
      for (let i = arr.length - 1; i >= 0; i--) {
        result = result * BigInt(256) + BigInt(arr[i]);
      }
      return result;
    }
    
    console.log(method1(buffer).toString(16));
    console.log(method2(buffer).toString(16));
    console.log(method3(buffer).toString(16));
    

    Note that this includes a bug fix for method3: where you wrote for (let i = arr.length - 1; i >= 0; i++), you clearly meant i-- at the end.

    For "method1" this prints: ffeeddccbbaa998877665544332211
    Because method1 is a big-endian conversion (first byte of the array is most-significant part of the result) without size limit.

    For "method2" this prints: 8899aabbccddeeff
    Because method2 is a little-endian conversion (first byte of the array is least significant part of the result) limited to 64 bits.
    If you switch the second getBigUint64 argument from true to false, you get big-endian behavior: ffeeddccbbaa9988.
    To eliminate the size limitation, you'd have to add a loop: using getBigUint64 you can get 64-bit chunks, which you can assemble using shifts similar to method1 and method3.
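The chunked loop described above could look roughly like the following. This is only a sketch under a few assumptions: the input is a Uint8Array in little-endian order, the helper name leBytesToBigInt is made up for illustration, and any trailing bytes that do not fill a whole 64-bit chunk are folded in one byte at a time.

```javascript
// Hypothetical helper (not part of any library): little-endian conversion
// without a size limit, reading 64-bit chunks via DataView.getBigUint64.
function leBytesToBigInt(bytes) {
  // Respect byteOffset/byteLength so that subarrays work as well.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const fullChunks = Math.floor(bytes.byteLength / 8);
  let result = 0n;

  // In little-endian order the last bytes are the most significant ones,
  // so start with the leftover bytes that don't fill a whole chunk.
  for (let i = bytes.byteLength - 1; i >= fullChunks * 8; i--) {
    result = (result << 8n) | BigInt(bytes[i]);
  }
  // Then append the full 64-bit chunks, from most to least significant.
  for (let chunk = fullChunks - 1; chunk >= 0; chunk--) {
    result = (result << 64n) | view.getBigUint64(chunk * 8, true);
  }
  return result;
}

const bytes = new Uint8Array([
  0xff, 0xee, 0xdd, 0xcc, 0xbb, 0xaa, 0x99, 0x88,
  0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11]);
console.log(leBytesToBigInt(bytes).toString(16));
// prints "112233445566778899aabbccddeeff", the same as method3
```

Whether this actually beats the byte-by-byte loop depends on the engine and the input sizes, so it is worth measuring before adopting it.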

    For "method3" this prints: 112233445566778899aabbccddeeff
    Because method3 is a little-endian conversion without size limit. If you reverse the for-loop's direction, you'll get the same big-endian behavior as method1. Also note that result * 256n gives the same value as result << 8n; the latter is a bit faster.
    (Side note: BigInt(0) and BigInt(256) are needlessly verbose, just write 0n and 256n instead. Additional benefit: 123456789123456789n does what you'd expect, whereas BigInt(123456789123456789) does not, because the Number literal is rounded to the nearest double before BigInt() ever sees it.)
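To make the last two points concrete, here is a minimal sketch of method3 with the loop direction reversed and the multiplication replaced by a shift. The name beBytesToBigInt and the three-byte sample are assumptions chosen for illustration only.

```javascript
// Hypothetical helper: method3 with the loop running front to back,
// which turns it into a big-endian conversion like method1.
function beBytesToBigInt(arr) {
  let result = 0n;
  for (let i = 0; i < arr.length; i++) {
    // Shifting left by 8 bits is equivalent to multiplying by 256n.
    result = (result << 8n) | BigInt(arr[i]);
  }
  return result;
}

const sample = new Uint8Array([0x01, 0x02, 0x03]);
console.log(beBytesToBigInt(sample).toString(16)); // prints "10203"

// The literal-vs-constructor pitfall from the side note:
// the Number literal loses precision before BigInt() converts it.
console.log(123456789123456789n === BigInt(123456789123456789)); // prints false
```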

    So which method should you use? That depends on:
    (1) Do your incoming arrays assume BE or LE encoding?
    (2) Are your BigInts limited to 64 bits or arbitrarily large?
    (3) Is this performance-critical code, or are all approaches "fast enough"?

    Taking a step back: if you control both parts of the overall process (converting BigInts to Uint8Array, then transmitting/storing them, then converting back to BigInt), consider simply using hexadecimal strings instead: that'll be easier to code, easier to debug, and significantly faster. Something like:

    function serialize(bigint) {
      return "0x" + bigint.toString(16);
    }
    function deserialize(serialized_bigint) {
      return BigInt(serialized_bigint);
    }
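One caveat if negative values can occur: (-255n).toString(16) yields "-ff", so prefixing "0x" produces "0x-ff", which BigInt() rejects with a SyntaxError. A possible sign-aware variant is sketched below; serializeSigned and deserializeSigned are hypothetical names, and the "-0x..." wire format is an assumption, not an established convention.

```javascript
// Hypothetical sign-aware variants of serialize/deserialize.
function serializeSigned(bigint) {
  // Put the sign in front of the "0x" prefix instead of after it.
  return bigint < 0n
    ? "-0x" + (-bigint).toString(16)
    : "0x" + bigint.toString(16);
}

function deserializeSigned(text) {
  // BigInt() does not accept a sign combined with a "0x" prefix,
  // so strip the sign, parse the magnitude, then negate.
  return text.startsWith("-") ? -BigInt(text.slice(1)) : BigInt(text);
}

const value = 0xffeeddccbbaa998877665544332211n;
console.log(serializeSigned(value));   // prints "0xffeeddccbbaa998877665544332211"
console.log(deserializeSigned(serializeSigned(value)) === value);   // prints true
console.log(serializeSigned(-255n));   // prints "-0xff"
console.log(deserializeSigned("-0xff") === -255n);                  // prints true
```

If your values are always non-negative, the shorter versions above are enough.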