
Byte operations (XOR) in Python


#!/usr/bin/env python3

import binascii


var=binascii.a2b_qp("hello")
key=binascii.a2b_qp("supersecretkey")[:len(var)]

print(binascii.b2a_qp(var))
print(binascii.b2a_qp(key))


# Here I want to XOR the bytes in var and key and store the result in 'encrypted': encrypted = var XOR key

print(binascii.b2a_qp(encrypted))

If someone could show me how to accomplish this, I would be very happy. I'm very new to data-type conversions, and reading through the Python wiki is not as clear as I would like.


Solution

  • It looks like what you need to do is XOR each character in the message with the corresponding character in the key. To do that with strings you need a bit of interconversion using ord and chr, because you can only XOR numbers, not strings:

    >>> var = "hello"
    >>> key = "supersecretkey"
    >>> encrypted = [ chr(ord(a) ^ ord(b)) for (a,b) in zip(var, key) ]
    >>> encrypted
    ['\x1b', '\x10', '\x1c', '\t', '\x1d']
    
    >>> decrypted = [ chr(ord(a) ^ ord(b)) for (a,b) in zip(encrypted, key) ]
    >>> decrypted
    ['h', 'e', 'l', 'l', 'o']
    
    >>> "".join(decrypted)
    'hello'
    

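    Note that in Python 3, binascii.a2b_qp("hello") returns a bytes object (b'hello'), not a str. Iterating over bytes yields integers, so the ord/chr conversion is needed only when var and key are str, as in the code above.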
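
    Since the values in the question are already bytes in Python 3, the XOR can be applied to them directly. The following is a minimal sketch that reuses the question's var and key; xor_bytes is a hypothetical helper name, not part of binascii:

```python
import binascii

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two byte strings pairwise; zip stops at the shorter input."""
    # Iterating over bytes yields ints, so ^ applies without ord/chr.
    return bytes(a ^ b for a, b in zip(data, key))

var = binascii.a2b_qp("hello")                      # b'hello'
key = binascii.a2b_qp("supersecretkey")[:len(var)]  # b'super'

encrypted = xor_bytes(var, key)
decrypted = xor_bytes(encrypted, key)  # XOR with the same key undoes it

print(encrypted)  # b'\x1b\x10\x1c\t\x1d'
print(decrypted)  # b'hello'
```

    Wrapping the generator in bytes(...) keeps the result as a bytes object, so it can be passed straight to binascii.b2a_qp as in the question.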

    Your approach, and my code above, only covers the whole message if the key is at least as long as the message, because zip stops at the shorter of its inputs. However, you can easily repeat the key if required using itertools.cycle:

    >>> from itertools import cycle
    >>> var="hello"
    >>> key="xy"
    
    >>> encrypted = [ chr(ord(a) ^ ord(b)) for (a,b) in zip(var, cycle(key)) ]
    >>> encrypted
    ['\x10', '\x1c', '\x14', '\x15', '\x17']
    
    >>> decrypted = [ chr(ord(a) ^ ord(b)) for (a,b) in zip(encrypted, cycle(key)) ]
    >>> "".join(decrypted)
    'hello'
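
    The repeating-key idea can also be packaged as a small function that works on bytes. This is only a sketch: the name xor_repeating_key and the empty-key check are assumptions added for illustration, not something from the question.

```python
from itertools import cycle

def xor_repeating_key(data: bytes, key: bytes) -> bytes:
    """XOR data with key, repeating the key as often as needed."""
    if not key:
        # cycle() over an empty key yields nothing, so zip would return
        # an empty result and silently drop the data.
        raise ValueError("key must not be empty")
    return bytes(a ^ b for a, b in zip(data, cycle(key)))

ciphertext = xor_repeating_key(b"hello", b"xy")
print(ciphertext)                            # b'\x10\x1c\x14\x15\x17'
print(xor_repeating_key(ciphertext, b"xy"))  # b'hello'
```

    Because XOR is its own inverse, the same function serves for both encryption and decryption.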
    

    To address the issue of Unicode/multi-byte characters (raised in the comments below), you can convert the string (and the key) to bytes, zip these together, then perform the XOR, something like:

    >>> key="supersecretkey"
    >>> var=u"hello\u2764"
    >>> var
    'hello❤'
    
    >>> encrypted = [ a ^ b for (a,b) in zip(bytes(var, 'utf-8'),cycle(bytes(key, 'utf-8'))) ]
    >>> encrypted
    [27, 16, 28, 9, 29, 145, 248, 199]
    
    >>> decrypted = [ a ^ b for (a,b) in zip(bytes(encrypted), cycle(bytes(key, 'utf-8'))) ]
    >>> decrypted
    [104, 101, 108, 108, 111, 226, 157, 164]
    
    >>> bytes(decrypted)
    b'hello\xe2\x9d\xa4'
    
    >>> bytes(decrypted).decode()
    'hello❤'
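
    The XORed bytes are generally not valid UTF-8, so they should stay as bytes (or be shown as hex) rather than be decoded back into a string. A sketch of a full round trip under that assumption; xor_text and unxor_text are hypothetical names chosen for this example:

```python
from itertools import cycle

def xor_text(text: str, key: str) -> bytes:
    """Encode text and key as UTF-8 and XOR them, repeating the key."""
    data = text.encode("utf-8")
    key_bytes = key.encode("utf-8")
    return bytes(a ^ b for a, b in zip(data, cycle(key_bytes)))

def unxor_text(ciphertext: bytes, key: str) -> str:
    """Reverse xor_text: XOR with the same key, then decode as UTF-8."""
    key_bytes = key.encode("utf-8")
    plain = bytes(a ^ b for a, b in zip(ciphertext, cycle(key_bytes)))
    return plain.decode("utf-8")

ciphertext = xor_text("hello\u2764", "supersecretkey")
as_hex = ciphertext.hex()  # printable form for storage or display
print(as_hex)              # 1b101c091d91f8c7
print(unxor_text(bytes.fromhex(as_hex), "supersecretkey"))  # hello❤
```

    The hex string matches the integer list shown above (27 is 0x1b, 16 is 0x10, and so on), and bytes.fromhex turns it back into the original ciphertext before decryption.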