python algorithm radix converters

How to convert a string into a custom base using 2 alphabet chars per letter


The functions below convert the number 255 (base 10) into 'FF' (base 16) using the base-16 alphabet '0123456789ABCDEF'.

I'm having difficulty figuring out how to modify the functions so that each digit uses 2 characters of the alphabet, i.e. so that the number 255 (base 10) converts to 'xFxF' (base 16) using the modified base-16 alphabet 'x0x1x2x3x4x5x6x7x8x9xAxBxCxDxExF'.

def v2r(num, alphabet):
  """Convert base 10 number into a string of a custom base (alphabet)."""
  alphabet_length = len(alphabet)
  result = ''
  while num > 0:
    result = alphabet[num % alphabet_length] + result
    num = num // alphabet_length
  return result


def r2v(data, alphabet):
  """Convert string of a custom base (alphabet) back into base 10 number."""
  alphabet_length = len(alphabet)
  num = 0
  for char in data:
    num = alphabet_length * num + alphabet[:alphabet_length].index(char)
  return num

base16 = v2r(255, '0123456789ABCDEF')
base10 = r2v(base16, '0123456789ABCDEF')
print(base16, base10)
# output: FF 255

# base16 = v2r(255, 'x0x1x2x3x4x5x6x7x8x9xAxBxCxDxExF')
# base10 = r2v(base16, 'x0x1x2x3x4x5x6x7x8x9xAxBxCxDxExF')
# print(base16, base10)
# output: xFxF 255

Solution

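  • Here is a possible workaround. I think the bug comes from how Python treats strings as iterables: indexing or iterating over a string yields one character at a time, so a two-character symbol such as 'xF' is never seen as a single digit.
    I changed the base-16 alphabet into a list of two-character symbols and adjusted the functions slightly to work with it. v2r now returns a list of symbols, which you join for display, and it appears to work.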

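    def v2r(num, alphabet):
        """Convert a base 10 number into a list of symbols of a custom base (alphabet)."""
        alphabet_length = len(alphabet)
        if num == 0:
            # Without this branch, zero would encode to an empty list.
            return [alphabet[0]]
        result = []
        while num > 0:
            # Prepend the symbol for the current least significant digit.
            result = [alphabet[num % alphabet_length]] + result
            num = num // alphabet_length
        return result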
    
    
    def r2v(data, alphabet):
        """Convert a list of symbols of a custom base (alphabet) back into a base 10 number."""
        alphabet_length = len(alphabet)
        num = 0
        for char in data:
            num = alphabet_length * num + alphabet.index(char)
        return num
    
    alphabet = [
        'x0','x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8',
        'x9', 'xA', 'xB', 'xC', 'xD', 'xE', 'xF'
    ]
    base16 = v2r(255, alphabet)
    base10 = r2v(base16, alphabet)
    print(''.join(base16), base10)
    # output: xFxF 255
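
The printed result is a joined string, while r2v expects the list of symbols. Here is a minimal sketch of decoding the joined form directly, assuming every symbol in the alphabet has the same width. `split_symbols` and `r2v_joined` are hypothetical helper names introduced for illustration, not part of the code above.

```python
def split_symbols(text, width):
    """Split a joined string into fixed-width symbols, e.g. 'xFxF' -> ['xF', 'xF']."""
    if len(text) % width:
        raise ValueError("text length is not a multiple of the symbol width")
    return [text[i:i + width] for i in range(0, len(text), width)]


def r2v_joined(text, alphabet):
    """Decode a joined string; assumes all symbols share the width of alphabet[0]."""
    width = len(alphabet[0])
    # Map each symbol to its digit value once instead of calling list.index per symbol.
    index = {symbol: value for value, symbol in enumerate(alphabet)}
    num = 0
    for symbol in split_symbols(text, width):
        num = len(alphabet) * num + index[symbol]
    return num


alphabet = ['x' + c for c in '0123456789ABCDEF']
print(split_symbols('xFxF', 2), r2v_joined('xFxF', alphabet))
# output: ['xF', 'xF'] 255
```

If the symbols had different widths, fixed-width slicing would not work and the string would need a real tokenizer.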
    

    Following the OP's comment: to use every two-character hexadecimal pair as one symbol (256 symbols, so effectively base 256), just declare the following alphabet:

    hexa = '0123456789abcdef'
    alphabet = [
        a+b for a in hexa for b in hexa
    ]
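
As a rough illustration of what that alphabet produces, here is a sketch that repeats compact versions of the two functions so that it runs on its own. The input 65535 is only an example value, not something from the question.

```python
def v2r(num, alphabet):
    """Convert a base 10 number into a list of symbols from the alphabet."""
    result = []
    while num > 0:
        result = [alphabet[num % len(alphabet)]] + result
        num //= len(alphabet)
    return result


def r2v(data, alphabet):
    """Convert a list of symbols back into a base 10 number."""
    num = 0
    for symbol in data:
        num = len(alphabet) * num + alphabet.index(symbol)
    return num


hexa = '0123456789abcdef'
alphabet = [a + b for a in hexa for b in hexa]  # '00', '01', ..., 'ff'

print(len(alphabet))
# output: 256

encoded = v2r(65535, alphabet)
print(''.join(encoded), r2v(encoded, alphabet))
# output: ffff 65535
```

With this alphabet each symbol is one byte written as two hex characters, so the joined result looks like ordinary hexadecimal padded to an even number of digits.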