Search code examples
mathencodingnumber-formattingradix

Base32 alphabet that is an overlapping superset of hexadecimal? (math/CS)


This is maybe more of a math question, but I'm stumped on it:

Let's say I have an 8-digit hex string. That can represent values from 0 to 2^32-1. Now let's say I want to have an 8-digit string of another base like base32. Is it possible to construct an alphabet for base32 (or another base) that is a strict superset of hexadecimal so that any hex string below 2^32-1 will decode via base32 to the same value and only larger values >=2^32 start incorporating base32 characters outside the hex range?

In other words is it possible to "upgrade" from base 16 to a higher numbered base in a way that is backward compatible with hex identifiers?


Solution

  • You can assign numbers to 8-character strings however you like.

    There are 232 8-character hex strings, to which you can certainly assign their hex values.

    There are 240 8-character strings with characters in, say, 0123456789ABCDEFGHJKMNPQRSTUVWXY. 232 are hex strings, and the remaining 240 - 232 strings can be assigned any numbers you like.

    You won't be able to assign them numbers via a "normal" decimal-like system, however, because hex requres "10" to be 16, not 32. There are ways that aren't that hard, however. For example, given a 40-bit number:

    1. Convert the lower 32-bits to 8 character HEX.
    2. Assign one of the remaining bits to each character, and for each one bit, add 'G' to the corresponding character, changing its range from '0-F' to 'G-Y'

    Now you have a string for each 40-bit number, and the smaller ones have the same strings as their hex representations.