python python-2.7 cryptography xor bitwise-xor

XOR two hexadecimal strings

I am trying to find the key of a one-time pad, and I have 10 ciphertexts. The plaintext letters are encoded as 8-bit ASCII, the given ciphertexts are written in hex, and I'm using Python 2.7.

The idea is that XORing a letter with a space flips its case, and XORing x with x gives zero. So when I XOR the characters at the same position of two ciphertexts, the key cancels against itself and what is left is the XOR of the two message characters. I wrote this code for XORing two hex strings:

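    import binascii

    def hex_to_text(s):
        # Decode a hex string such as "4869" into the raw characters "Hi".
        return binascii.unhexlify(s)

    def XoR(a, b):
        # XOR two equal-length hex strings and return the decoded result.
        value = int(a, 16) ^ int(b, 16)
        # Zero-pad back to the input length: hex() would drop leading zeros
        # and leave an odd number of digits, which unhexlify rejects.
        return hex_to_text("%0*x" % (len(a), value))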

When the key is an even number the XoR function works correctly, but when the key is odd it does not return the same character with its case flipped.

What am I doing wrong?

Solution

a general idea of how to solve this, independent of python:

let's start by saying a char is 8-bit ASCII

if you look at the first char from the first ciphertext, you will probably notice that it is outside the ASCII ranges for plain-text letters, which one could say are a-z (0x61-0x7a) and A-Z (0x41-0x5a)

there is a high probability that you only have to consider key values that, xored with this char, give something inside the specified value ranges

the same holds for the other 9 texts and their respective first char

and, interestingly, the list of possible key values for this char has to hold for every ciphertext with the same key, so each and every ciphertext we look at reduces the range further
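the narrowing described above can be sketched like this. It is a minimal illustration, not a full solver: `candidate_keys`, `key_byte` and the sample letters are made up for the example, and it assumes every plaintext has a letter at the position being examined.

```python
def candidate_keys(column):
    # column: the cipher byte at one position, taken from every ciphertext.
    candidates = set(range(256))
    for c in column:
        fits = set(k for k in range(256)
                   if 0x41 <= (c ^ k) <= 0x5a or 0x61 <= (c ^ k) <= 0x7a)
        candidates &= fits  # each ciphertext narrows the set further
    return candidates

# Made-up data: three letters encrypted with the same (hypothetical) key byte.
key_byte = 0x9c
column = [ord(ch) ^ key_byte for ch in "TaZ"]
one_text = candidate_keys(column[:1])
all_texts = candidate_keys(column)
print("%d candidates after one text, %d after three" % (len(one_text), len(all_texts)))
```

a strict intersection like this becomes empty as soon as one plaintext has a space or punctuation at that position, which is a good reason to count matches instead of requiring all of them.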

now, what can you do with this approach?

write a function that takes 2 parameters (bytes) and tests if the result of a xor falls into the specified range, if yes, return 1, if no return 0
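a minimal sketch of such a function, assuming both parameters are integers in 0..255 (the name `in_range` is only a placeholder):

```python
def in_range(cipher_byte, key_byte):
    # XOR the two bytes and test whether the result is an ASCII letter.
    plain = cipher_byte ^ key_byte
    if 0x41 <= plain <= 0x5a or 0x61 <= plain <= 0x7a:
        return 1
    return 0
```

returning 1 or 0 rather than True or False makes it straightforward to sum the results later.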

now make 3 nested loops to call this function

the outer loop (X) goes through the char positions in the ciphertext, the middle loop (Y) goes from 0 to 255, and the inner loop (Z) goes through the ciphertexts

in the inner loop call your function with parameter 1 being the character at position X of ciphertext Z and parameter 2 being Y

now what to do with the result:

you want to have a dictionary/lookup table that per position X holds an array of 256 elements (one for each possible key byte 0 to 255)

the index of these elements will be Y; the value of each element will be the sum of your function results over all Z

in the end what you will have is, for every position in your ciphertext, an array that tells you for each keybyte how likely it is to be the key byte ... the higher the value, the higher the probability
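the three loops and the lookup table could look roughly like this. The helper names, the sample key and the sample plaintexts are all invented for the illustration, and the ciphertexts are assumed to arrive as hex strings:

```python
import binascii

def in_range(cipher_byte, key_byte):
    # 1 if the XOR is an ASCII letter, 0 otherwise.
    plain = cipher_byte ^ key_byte
    return 1 if (0x41 <= plain <= 0x5a or 0x61 <= plain <= 0x7a) else 0

def score_keys(ciphertexts_hex):
    texts = [bytearray(binascii.unhexlify(h)) for h in ciphertexts_hex]
    length = min(len(t) for t in texts)  # only positions every text has
    scores = {}
    for x in range(length):              # X: position in the ciphertext
        scores[x] = [0] * 256
        for y in range(256):             # Y: candidate key byte
            for z in texts:              # Z: each ciphertext
                scores[x][y] += in_range(z[x], y)
    return scores

# Made-up test data: four short plaintexts under one hypothetical key.
key = bytearray([0x3a, 0xc5, 0x7e, 0x11])
plains = ["Hell", "word", "Test", "abcd"]
ciphertexts = [
    binascii.hexlify(bytes(bytearray(ord(p) ^ k for p, k in zip(plain, key))))
    for plain in plains
]
scores = score_keys(ciphertexts)
```

with this sample data the true key byte reaches the maximum score at every position, but so can other bytes (for example the key byte xored with 0x20), which is why the later ranking and dictionary steps are needed.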

then for each position in your ciphertext order the possible keybytes by their probability and partition them by probability
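one way to do the ordering and partitioning for a single position, assuming the scores are a list of 256 counts indexed by key byte (`partition_by_score` and the sample counts are hypothetical):

```python
from itertools import groupby

def partition_by_score(position_scores):
    # Rank key bytes from highest to lowest score.
    ranked = sorted(range(256), key=lambda k: position_scores[k], reverse=True)
    # groupby works here because equal scores are now adjacent.
    return [(score, list(keys))
            for score, keys in groupby(ranked, key=lambda k: position_scores[k])]

# Made-up scores: two key bytes tie for first place, one comes second.
sample = [0] * 256
sample[0x3a] = 10
sample[0x1a] = 10
sample[0x55] = 7
groups = partition_by_score(sample)
top_score, top_keys = groups[0]
```

`groups[0]` is then the highest-probability group that the next step works with.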

then take a chunk of all ciphertexts, let's say the first 8 to 16 chars, and calculate the plaintext for all keys in the highest probability group

store key chunk and plaintext chunk together in a list
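as a rough sketch, the candidate list can be built with `itertools.product` over the top group of each position. The function name and the two-byte sample are invented, and the number of combinations grows quickly with chunk size, so this only makes sense while the top groups are small:

```python
from itertools import product

def chunk_candidates(cipher_chunks, top_keys):
    # cipher_chunks: one bytearray per ciphertext, all the same length.
    # top_keys: for each position in the chunk, its most likely key bytes.
    results = []
    for key_chunk in product(*top_keys):
        plains = ["".join(chr(c ^ k) for c, k in zip(chunk, key_chunk))
                  for chunk in cipher_chunks]
        results.append((key_chunk, plains))  # key chunk stored with its plaintexts
    return results

# Made-up two-byte example under a hypothetical key (0x3a, 0xc5).
true_key = (0x3a, 0xc5)
chunks = [bytearray(ord(p) ^ k for p, k in zip(text, true_key))
          for text in ("Hi", "to")]
results = chunk_candidates(chunks, [[0x3a, 0x1a], [0xc5]])
```

here the wrong candidate 0x1a differs from the real key byte only in the 0x20 bit, so it produces the same letters with the case flipped, which is the kind of ambiguity the dictionary check has to resolve.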

now test your list of possible plaintexts against a common dictionary, and again rate them 1 if they contain words that can be found in a dictionary and 0 otherwise ... sum up for all different ciphertexts ... (or use another metric to rate how good a key is)
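a very small version of this rating, with a stand-in word set instead of a real dictionary file (both `WORDS` and `rate` are placeholders for whatever word list and metric you choose):

```python
import re

# Stand-in for a real dictionary; load a proper word list in practice.
WORDS = set(["the", "and", "hi", "to", "end"])

def rate(plaintexts, words=WORDS):
    score = 0
    for text in plaintexts:
        # Split the candidate plaintext into lowercase runs of letters.
        tokens = re.findall(r"[a-z]+", text.lower())
        if any(token in words for token in tokens):
            score += 1  # 1 per ciphertext whose chunk contains a known word
    return score
```

words cut off at the chunk boundary will not match, so a real metric may want to allow prefixes or use letter frequencies instead.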

order the key chunks by the highest value (read: the key that potentially solved the most chunks across all ciphertexts comes first and the one that produced garbage comes last) and continue with the next chunk ...

repeat this with bigger chunks, selecting not keybytes but the next smaller size of key chunks, until your chunksize gets to the ciphertext size...

of course this is an automated way to find a likely key, and there is some implementation work until you have a completely automated solution. if you just want to solve these 10 ciphertexts, you can abort the approach after the likely keybytes or the first chunks, and do the rest by hand ...