Tags: text, utf-8, base64, decode, whitespace

Weird spaces in text after decoding base64


For context: every receipt must carry a QR code containing the store name, VAT number, date and time, total price, and VAT amount. Scanning the QR code yields a base64 string that holds this information, but after decoding it there are also weird space-like characters between the fields that I don't understand. Somehow they are required by the official apps that read the QR code.

Base64 example text:

AR5Vbml2ZXJzYWwgQ29sZCBTdG9yZSBUcmRnLiBDby4CDzMwMDU2Mjg2MjQxMDAwMwMTMjAyMy0wNC0yMiAxODoyNzoxNQQFMjUuODgFBDMuMzg=



Solution

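  • Looking at the decoded data, each string is preceded by two bytes: a field number (1-5 in this data) and the length of the string that follows. This is a tag-length-value (TLV) layout, and those non-printable bytes are the "weird spaces" you see between the fields.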

    Here's Python code to extract the data:

    import base64
    
    s = 'AR5Vbml2ZXJzYWwgQ29sZCBTdG9yZSBUcmRnLiBDby4CDzMwMDU2Mjg2MjQxMDAwMwMTMjAyMy0wNC0yMiAxODoyNzoxNQQFMjUuODgFBDMuMzg='
    data = base64.b64decode(s)
    while data:
        num = data[0]
        size = data[1]
        string = data[2:2+size].decode('ascii')  # assume data is ASCII-encoded
        print(f'{num=} {size=} {string=}')
        data = data[size+2:]  # remove num/size/string from the data
    

    Output:

    num=1 size=30 string='Universal Cold Store Trdg. Co.'
    num=2 size=15 string='300562862410003'
    num=3 size=19 string='2023-04-22 18:27:15'
    num=4 size=5 string='25.88'
    num=5 size=4 string='3.38'
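
If you need the fields as a lookup table, or need to produce such a payload yourself, the same idea can be wrapped in two small helpers. This is only a sketch: `parse_tlv` and `build_tlv` are names made up for this example, the one-byte tag and one-byte length are inferred from the sample data rather than from any official specification, and decoding as UTF-8 (instead of ASCII as above) is an assumption made so that non-ASCII store names would not fail.

```python
import base64


def parse_tlv(payload: bytes) -> dict[int, str]:
    """Split a tag-length-value payload into {tag: text}.

    Assumes one byte for the tag and one byte for the length,
    as observed in the sample data.
    """
    fields = {}
    pos = 0
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise ValueError("truncated tag/length header")
        tag, size = payload[pos], payload[pos + 1]
        value = payload[pos + 2:pos + 2 + size]
        if len(value) != size:
            raise ValueError(f"field {tag} is shorter than its declared length")
        fields[tag] = value.decode("utf-8")  # assumption: text is UTF-8
        pos += 2 + size
    return fields


def build_tlv(fields: dict[int, str]) -> bytes:
    """Inverse of parse_tlv: emit tag byte, length byte, then the text."""
    out = bytearray()
    for tag, text in fields.items():
        raw = text.encode("utf-8")
        if len(raw) > 255:
            raise ValueError(f"field {tag} does not fit in a one-byte length")
        out += bytes([tag, len(raw)]) + raw
    return bytes(out)


s = 'AR5Vbml2ZXJzYWwgQ29sZCBTdG9yZSBUcmRnLiBDby4CDzMwMDU2Mjg2MjQxMDAwMwMTMjAyMy0wNC0yMiAxODoyNzoxNQQFMjUuODgFBDMuMzg='
fields = parse_tlv(base64.b64decode(s))
print(fields[1], fields[4])

# Re-encoding the parsed fields should give back the original base64 text.
rebuilt = base64.b64encode(build_tlv(fields)).decode('ascii')
print(rebuilt == s)
```

Raising on a truncated field, rather than silently returning a short string, makes a damaged or differently structured QR payload easier to spot.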