Tags: javascript, v8, bigint, spidermonkey

V8 (or other JS engine) BigInt Implementation - Displaying as Decimal


I'm wondering if someone might be able to explain a specific aspect of the JavaScript BigInt implementation to me.

The overall implementation I understand: rather than operating in base 10, build an array representing the digits, effectively operating in base 2^32 or 2^64 depending on the build architecture.
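
For concreteness, here's a sketch of that idea as I understand it (illustrative only; presumably not V8's actual storage layout):

    // Store a magnitude as little-endian "digits" in base 2^64.
    const BASE = 2n ** 64n;
    function toDigits(n) {
      const digits = [];
      while (n > 0n) {
        digits.push(n % BASE);  // least-significant digit first
        n /= BASE;
      }
      return digits;
    }
    // toDigits(2n ** 70n + 5n) yields [5n, 64n], since 2^70 + 5 = 64 * 2^64 + 5.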

What I'm curious about is the display/console.log implementation for this type. It's incredibly fast for most common cases, to the point where if you didn't know anything about the implementation you'd probably assume it was native. But knowing what I do about the implementation, it's incredible to me that it can do the decimal conversion and string building as quickly as it does, and I'm deeply curious how it works.

A moderate look into bigint.cc and bigint.h in the Chromium source has only confused me further: there are a number of methods whose signatures are declared but whose implementations I can't seem to find.

I'd appreciate even just being pointed to another spot in the Chromium source that contains the decimal conversion implementation.


Solution

  • (V8 developer here.)

    @Bergi basically provided the relevant links already, so just to sum it up:

    Formatting a binary number as a decimal string is a "base conversion", and its basic building block is:

    while (number > 0n) {
      next_char = "0123456789"[Number(number % 10n)];  // least-significant digit first
      number = number / 10n;  // Truncating integer division (BigInt "/" truncates).
    }
    

    (Assuming that next_char is also written into some string backing store; this string is being built up from the right.)
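
    Completing that into a self-contained function with its string backing store (an illustrative sketch using BigInt itself; the real code of course works on the raw digit array in C++):

    function toDecimalBasecase(n) {
      let s = "";
      while (n > 0n) {
        s = "0123456789"[Number(n % 10n)] + s;  // prepend: the string grows from the right
        n = n / 10n;                            // truncating integer division
      }
      return s === "" ? "0" : s;
    }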

    For the common situation that the BigInt has only one 64-bit "digit" to begin with, this algorithm is special-cased in the code here.

    The generalization for more digits and non-decimal radixes is here; it's the same algorithm.
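
    Sketched in the same illustrative style (not the actual C++ behind the link), the generalization just swaps the divisor and the digit alphabet; in the real multi-digit code, each such division is itself a linear pass over the digit array, which is where the quadratic scaling discussed below comes from:

    const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz";
    function toStringBasecase(n, radix) {
      const r = BigInt(radix);
      let s = "";
      while (n > 0n) {
        s = ALPHABET[Number(n % r)] + s;
        n = n / r;
      }
      return s === "" ? "0" : s;
    }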

    This algorithm runs sufficiently fast for sufficiently small BigInts; its problem is that it scales quadratically with the length of the BigInt. So for large BigInts (where some initial overhead easily pays for itself due to enabling better scaling), we have a divide-and-conquer implementation that's built on better-scaling division and multiplication algorithms.
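
    To illustrate the idea with a simplified sketch (this is not V8's actual implementation, which among other things caches powers of the radix and builds on the fast multiplication/division routines): split the number at a power of 10 near the middle of its decimal representation, convert the two halves recursively, and zero-pad the low half:

    function toDecimal(n) {
      if (n < 10n ** 19n) return n.toString();  // base case: stand-in for the quadratic loop above
      // Pick k so that 10^k has roughly half as many digits as n (log2(10) ≈ 3.32).
      const k = BigInt(bitLength(n)) * 100n / 664n;
      const divisor = 10n ** k;
      // One division yields both halves; the low half must be padded to exactly k digits.
      return toDecimal(n / divisor) + toDecimal(n % divisor).padStart(Number(k), "0");
    }

    function bitLength(n) {
      let bits = 0;
      while (n > 0n) { n >>= 1n; bits++; }
      return bits;
    }

    Because each level of recursion splits the work roughly in half, the total cost drops from quadratic toward the cost of the underlying division/multiplication routines times a logarithmic number of levels.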

    When the requested radix is a power of two, then no such heavy machinery is necessary, because a linear-time implementation is easy. That's why some_bigint.toString(16) is and always will be much faster than some_bigint.toString() (at least for large BigInts), so when you need de/serialization rather than human readability, hex strings are preferable for performance.
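
    The intuition, again as an illustrative sketch (the real code reads fixed-size bit groups directly out of the digit array in a single linear pass; the whole-number shifts below are just for readability): each hex character is simply four bits of the number, so no division is needed at all:

    function toHex(n) {
      let s = "";
      while (n > 0n) {
        s = "0123456789abcdef"[Number(n & 0xfn)] + s;  // low 4 bits -> one hex char
        n >>= 4n;                                      // shift; no division required
      }
      return s === "" ? "0" : s;
    }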

    "if you didn't know anything about the implementation you'd probably assume it was native"

    What does that even mean?