python, python-3.x, base-conversion, base62

How to fix the code for base62 encoding with Python3?


I am writing a Python script for base62 encoding, but my code produces the wrong result: the correct answer is LpuPe81bc2w, but I get LpuPe81bc0w. Could you review my code and tell me what I should do differently? I can't use pybase62.

I really want to know why my code doesn't work, because I want to understand the fundamentals. Preferably no pybase62, and please keep in mind that I am a beginner.

base_62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(base_62)

def to_base_62(number):  
 rete=''
 while number != 0:    
  rete = (base_62[number%BASE])+rete  
  number = int(number/BASE)
 return rete

print (to_base_62(18327995462734721974))

Solution

  • You ran afoul of floating-point precision: in Python 3, number/BASE is true division and returns a float, even when both operands are ints. I inserted a simple tracking statement into your code:

    def to_base_62(number):  
        rete=''
        while number != 0: 
            rete = base_62[number%BASE]+rete    
            print(len(rete), number/BASE)
            number = int(number/BASE)
        return rete
    

    Output:

    1 2.956128300441084e+17
    2 4767948871679168.0
    3 76902401156115.61
    4 1240361308969.5967
    5 20005827564.01613
    6 322674638.12903225
    7 5204429.645161291
    8 83942.40322580645
    9 1353.9032258064517
    10 21.822580645161292
    11 0.3387096774193548
    LpuPe81bc0w
    

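    A float carries only 53 bits of precision (roughly 15 to 17 significant decimal digits), so the floating-point division cannot represent the 18-digit first quotient exactly: 295612830044108418 is rounded to 295612830044108416, and the next digit comes out as 0 instead of 2. Instead, use integer division, which is exact for Python ints of any size: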

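    def to_base_62(number):
        rete=''
        while number != 0:
            rete = base_62[number%BASE]+rete
            print(len(rete), number // BASE)  # tracking output only
            number = number // BASE  # integer division: no float involved
        return rete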
    

    Output:

    1 295612830044108418
    2 4767948871679168
    3 76902401156115
    4 1240361308969
    5 20005827564
    6 322674638
    7 5204429
    8 83942
    9 1353
    10 21
    11 0
    LpuPe81bc2w
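
For reference, here is a minimal sketch of the complete function without the tracking statement. It uses the built-in divmod, which returns the integer quotient and the remainder in one step, so no float is ever created. The name BASE62_ALPHABET, the ValueError for negative input, and returning "0" for zero are assumptions that are not in the original code (the original loop returns an empty string for 0).

```python
BASE62_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(BASE62_ALPHABET)  # 62


def to_base_62(number):
    """Encode a non-negative int as a base62 string using integer arithmetic only."""
    if number < 0:
        # Assumption: negative numbers are rejected rather than encoded.
        raise ValueError("number must be non-negative")
    if number == 0:
        # Assumption: zero encodes as "0"; the original loop returns ''.
        return BASE62_ALPHABET[0]
    digits = []
    while number:
        # divmod gives (number // BASE, number % BASE) without going through a float.
        number, remainder = divmod(number, BASE)
        digits.append(BASE62_ALPHABET[remainder])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(digits))


n = 18327995462734721974
print(to_base_62(n))               # LpuPe81bc2w
print(int(n / BASE) == n // BASE)  # False: the float quotient is already off
```

The last line shows the root cause in isolation: for a number this large, int(n / BASE) and n // BASE disagree after a single division.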