Search code examples
coptimizationrustradix

Is there a shortcut or special-case when transforming long numbers from base-256 to base-255 and back?


I have many large numbers which are several hundred bytes long.

How can I efficiently transform these from base-256 into base-255, such that the numbers become up to one byte longer but don't use the value (or digit) 0xFF in any output byte?

(I have coded a general solution, but it's complexity O(n*n), as each input digit can affect many of the output digits.)

Is there a shortcut or special-case when transforming long numbers from base-256 to base-255 and back?


Solution

  • To speed this up you want to think about even larger bases.

    The first step would be to convert the original "base 256" into either "base 1<<32" or "base 1<<64" (depending on what your CPU is). This reduces the number of digits in the number by a factor of 4 (or 8). If byte order ("endianess") is compatible and "length in bytes" is a multiple of 8 (which should be easy to ensure in earlier code) this may involve not actually doing anything (some hassle with type conversion and the strict aliasing rule in the source code that becomes zero bytes of machine code).

    In a related way; instead of converting directly to "base 255" you could convert to "base 255*255*255*255" first; and then split each "large digit" into a pair of smaller "base 255*255" digits, then split those into smaller "base 255" digits. This splitting can be very fast with SIMD (split multiple digits at the same time).

    These changes should make it almost 16 times faster. Essentially, if your division algorithm is "O(n*n)" it will become "O(n/4 * n/4)" or "O(n*n/16)". Of course "O(n*n/16)" is considered the same as "O(n*n)" despite being a lot faster.