python encryption character-encoding pycryptodome

ValueError: Data must be aligned to block boundary in ECB mode (or additional backslashes from encoding of encrypted text)


I have this code:

    from Crypto.Cipher import DES

    # Encryption part

    key = b'abcdefgh'

    def pad(text):
       while len(text) % 8 != 0:
           text += b' '
       return text

    des = DES.new(key, DES.MODE_ECB)
    text = b'Secret text'
    padded_text = pad(text)

    encrypted_text = des.encrypt(padded_text)
    print(encrypted_text) # FIRST

    # Decryption part

    that_encrypted_text = input().encode('utf8')
    # This print shows the problem---------------
    print(that_encrypted_text) # SECOND
    # This print shows the problem --------------
    data = des.decrypt(that_encrypted_text)
    print(data)

  1. The FIRST print outputs: b'.\x12\x7f\xcf\xad+\xa9\x0c\xc4\xde\x05\x15\xef\x7f\x16\xa0'

  2. I enter this at the input() prompt: .\x12\x7f\xcf\xad+\xa9\x0c\xc4\xde\x05\x15\xef\x7f\x16\xa0

  3. The SECOND print outputs: b'.\\x12\\x7f\\xcf\\xad+\\xa9\\x0c\\xc4\\xde\\x05\\x15\\xef\\x7f\\x16\\xa0'

After this, because of the additional backslashes, decryption fails with:

ValueError: Data must be aligned to block boundary in ECB mode

Why do the additional backslashes appear after encoding, and how can I get rid of them so that the message can be decrypted? I want both parts of the program, encryption and decryption, to work separately. That is why the encrypted text is read with input().


Solution

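  • Entering .\x12\x7f\xcf\xad+\xa9\x0c\xc4\xde\x05\x15\xef\x7f\x16\xa0 at the input() prompt

    is equivalent to the string literal r'.\x12\x7f\xcf\xad+\xa9\x0c\xc4\xde\x05\x15\xef\x7f\x16\xa0'. input() returns the characters exactly as typed, so each \x12 arrives as four separate characters (backslash, x, 1, 2) instead of one byte. print() then shows every literal backslash doubled, which is the origin of the doubled backslashes in your SECOND print. The encoded text is also 58 bytes long rather than 16, and 58 is not a multiple of the 8-byte DES block size, hence the ValueError.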

    To turn the typed text back into the original bytes, use

    that_encrypted_text = (input().encode( 'raw_unicode_escape')
                                  .decode( 'unicode_escape')
                                  .encode( 'latin1'))
    

    See how the Python-specific text encodings raw_unicode_escape and unicode_escape handle backslashes, and note the role of the latin1 encoding there: it maps the code points 0-255 one-to-one to the byte values 0-255.
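
The conversion can be checked without the cipher at all. The following is a minimal sketch using only the standard library; the helper name unescape_typed_bytes is hypothetical, and the raw string stands in for what input() would return. The last lines show a different technique that is not part of the answer above: exchanging the ciphertext as hex text, which avoids escape sequences entirely.

```python
# Ciphertext as shown by the FIRST print in the question (16 bytes).
ciphertext = b'.\x12\x7f\xcf\xad+\xa9\x0c\xc4\xde\x05\x15\xef\x7f\x16\xa0'

# Stand-in for input(): the same characters typed at the prompt arrive
# as a str with literal backslashes, i.e. this raw string.
typed = r'.\x12\x7f\xcf\xad+\xa9\x0c\xc4\xde\x05\x15\xef\x7f\x16\xa0'


def unescape_typed_bytes(text):
    """Hypothetical helper: interpret escapes such as \\x12 in typed text."""
    return (text.encode('raw_unicode_escape')  # str -> bytes, backslashes kept as-is
                .decode('unicode_escape')      # \xNN sequences -> single characters
                .encode('latin1'))             # code points 0-255 -> bytes 0-255


naive = typed.encode('utf8')
print(len(naive), len(naive) % 8)        # length is not a multiple of 8

restored = unescape_typed_bytes(typed)
print(len(restored), restored == ciphertext)

# Alternative (an assumption, not from the answer): pass the data around as hex.
hex_text = ciphertext.hex()              # safe to print, copy, and type
print(bytes.fromhex(hex_text) == ciphertext)
```

With the hex variant, the encryption part would print encrypted_text.hex() and the decryption part would call bytes.fromhex(input()), so no escape handling is needed.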