Converting from a string to a number in base-64


So, I am trying to write a program to decode 6-character base-64 numbers.

Here is the problem statement:

Return the 36-bit number represented as a base-64 number in reverse order by the 6-character string s where the order of the 64 numerals is: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+

For example:

decode('000000') → 0

decode('gR1iC9') → 9876543210

decode('++++++') → 68719476735
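
To see why these examples hold, each character can be expanded positionally, with the first character as the least significant digit. The sketch below only checks that arithmetic; `ALPHABET`, `digit_values`, and `positional_value` are illustrative names that are not part of the problem statement.

```python
# Numeral order exactly as given in the problem statement
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+"

def digit_values(s):
    # Position of each character in ALPHABET is its base-64 digit value
    return [ALPHABET.index(c) for c in s]

def positional_value(s):
    # "Reverse order": the character at index i carries weight 64**i
    return sum(d * 64**i for i, d in enumerate(digit_values(s)))

print(digit_values("gR1iC9"))      # [42, 27, 1, 44, 12, 9]
print(positional_value("gR1iC9"))  # 9876543210
print(positional_value("++++++"))  # 68719476735, i.e. 64**6 - 1
```

Written out, the second example is 42 + 27·64 + 1·64² + 44·64³ + 12·64⁴ + 9·64⁵ = 9876543210.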

I would like to do this WITHOUT strings.

The easiest way to do this would be to create the inverse of the following functions:

def get_digit(d):
    ''' Convert a base 64 digit to the desired character '''
    if 0 <= d <= 9:
        # 0 - 9
        c = 48 + d
    elif 10 <= d <= 35:
        # A - Z
        c = 55 + d
    elif 36 <= d <= 61:
        # a - z
        c = 61 + d
    elif d == 62:
        # -
        c = 45
    elif d == 63:
        # +
        c = 43
    else:
        # We should never get here
        raise ValueError('Invalid digit for base 64: ' + str(d)) 
    return chr(c)

# Test `get_digit`
print(''.join([get_digit(d) for d in range(64)]))

def encode(n):
    ''' Convert integer n to base 64 '''
    out = []
    while n:
        n, r = n // 64, n % 64
        out.append(get_digit(r))
    while len(out) < 6:
        out.append('0')
    return ''.join(out)

# Test `encode`
for i in (0, 9876543210, 68719476735):
    print(i, encode(i))

Output

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+
0 000000
9876543210 gR1iC9
68719476735 ++++++

This code is actually from PM 2Ring on this page.

How do I write the inverse of this program?

A start:

My attempt at the inverse of get_digit is below:

def inv_get_digit(c):

    if 0 <= c <= 9:
        d = ord(c) - 48
    elif 'A' <= c <= 'Z':
        d = ord(c) - 55
    elif 'a' <= c <= 'z'
        d = ord(c) - 61
    elif c == '+':
        d = 63
    elif c == '-':
        d = 62
    else:
        raise ValueError('Invalid Input' + str(c))
    return d


def decode(n):

    out = []
    while n:
        n, r= n % 10, n ** (6-len(str))
        out.append(get_digit(r))
    while len(out) < 10:
        out.append('0')
    return ''.join(out)

Solution

  • Here's a program that combines my old code with some new code to perform the inverse operations.

    You have a syntax error in your inv_get_digit function: you left the colon off the end of an elif line. And there's no need to do str(c), since c is already a string.
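
One further difference between the posted inv_get_digit and the working version is the first test: the question compares the character against the integers 0 and 9, while the working version compares against the strings '0' and '9'. The snippet below is a small sketch of what that change does, assuming Python 3; the two helper names are hypothetical and exist only for this illustration.

```python
def in_digit_range_as_posted(c):
    # Condition copied from the question: integer bounds, string argument
    return 0 <= c <= 9

def in_digit_range_fixed(c):
    # Condition from the working version: string bounds, string argument
    return '0' <= c <= '9'

try:
    in_digit_range_as_posted('7')
    outcome = 'no error'
except TypeError:
    # Python 3 refuses to order an int against a str
    outcome = 'TypeError'

print(outcome)
print(in_digit_range_fixed('7'))
```

Under Python 3 the first call raises TypeError, so the posted function would fail on every input even with the colon restored.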

    I'm afraid that your decode function doesn't make much sense. It's supposed to take a string as input and return an integer. Please see a working version below.

    def get_digit(d):
        ''' Convert a base 64 digit to the desired character '''
        if 0 <= d <= 9:
            # 0 - 9
            c = 48 + d
        elif 10 <= d <= 35:
            # A - Z
            c = 55 + d
        elif 36 <= d <= 61:
            # a - z
            c = 61 + d
        elif d == 62:
            # -
            c = 45
        elif d == 63:
            # +
            c = 43
        else:
            # We should never get here
            raise ValueError('Invalid digit for base 64: ' + str(d)) 
        return chr(c)
    
    print('Testing get_digit') 
    digits = ''.join([get_digit(d) for d in range(64)])
    print(digits)
    
    def inv_get_digit(c):
        if '0' <= c <= '9':
            d = ord(c) - 48
        elif 'A' <= c <= 'Z':
            d = ord(c) - 55
        elif 'a' <= c <= 'z':
            d = ord(c) - 61
        elif c == '-':
            d = 62
        elif c == '+':
            d = 63
        else:
            raise ValueError('Invalid input: ' + c)
        return d
    
    print('\nTesting inv_get_digit') 
    nums = [inv_get_digit(c) for c in digits]
    print(nums == list(range(64)))
    
    def encode(n):
        ''' Convert integer n to base 64 '''
        out = []
        while n:
            n, r = n // 64, n % 64
            out.append(get_digit(r))
        while len(out) < 6:
            out.append('0')
        return ''.join(out)
    
    print('\nTesting encode')
    numdata = (0, 9876543210, 68719476735)
    strdata = []
    for i in numdata:
        s = encode(i)
        print(i, s)
        strdata.append(s)
    
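    def decode(s):
        ''' Convert a base 64 string (least significant digit first) to an integer '''
        n = 0
        # Read the most significant digit first so each step multiplies by 64
        for c in reversed(s):
            d = inv_get_digit(c)
            n = 64 * n + d
        return n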
    
    print('\nTesting decode')
    for s, oldn in zip(strdata, numdata):
        n = decode(s)
        print(s, n, n == oldn)
    

    Output

    Testing get_digit
    0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+
    
    Testing inv_get_digit
    True
    
    Testing encode
    0 000000
    9876543210 gR1iC9
    68719476735 ++++++
    
    Testing decode
    000000 0 True
    gR1iC9 9876543210 True
    ++++++ 68719476735 True
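
As a possible variation on the decode function above, the chain of range tests in inv_get_digit could be replaced by a lookup table built from the numeral string in the problem statement. This is only a sketch: `ALPHABET`, `DIGIT_VALUE`, and `decode_table` are names introduced here, and the 6-character length check is an assumption based on the problem statement rather than something the code above enforces.

```python
# Numeral order as given in the problem statement
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+"

# Map each character to its digit value, e.g. '0' -> 0, 'A' -> 10, '+' -> 63
DIGIT_VALUE = {c: i for i, c in enumerate(ALPHABET)}

def decode_table(s):
    ''' Decode a 6-character, least-significant-first base 64 string '''
    if len(s) != 6:
        raise ValueError('Expected 6 characters, got ' + str(len(s)))
    n = 0
    # Same accumulation as decode: start from the most significant digit
    for c in reversed(s):
        if c not in DIGIT_VALUE:
            raise ValueError('Invalid input: ' + c)
        n = 64 * n + DIGIT_VALUE[c]
    return n

for s in ('000000', 'gR1iC9', '++++++'):
    print(s, decode_table(s))
```

The accumulation step is unchanged; the table only swaps the character arithmetic for a dictionary lookup, which keeps the digit order in one place if the alphabet ever changes.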