So I am currently working with the library: simple-crypt.
I have managed to transform a certain input string into its binary string.
from simplecrypt import encrypt

pw_data = input("Please type in your p!")  # enter password
pw_data_confirmed = input("Please confirm!")
_platform = input("Please tell me the platform!")  # belonging platform
if pw_data == pw_data_confirmed:  # check confirmed pw
    print("Received!")
    salt_data = "AbCdEfkhl"  # salt key
    ciphertext = encrypt(salt_data, pw_data.encode("utf8"))  # encrypt pw with salt key
Binary string e.g: b'sc\x00\x02X\xd8\x8ez\xbfB\x03s\xc5\x8bm\xecp\x19\x8d\xd6lqW\xf1\xc3\xa4y\x8f\x1aW)\x9bX\xfc\x0e\xa4\xf2ngJj/]{\x80\x06-\x07\x8cQ\xeef\x0b\x02?\x86\x19\x98\x94eW\x08}\x1d8\xdb\xe57\xf7\x97\x81\xb6\xc7\x08\n^\xc9\xc0'
This binary string will then be stored in a word document.
The problem now is: as soon as I read the document back and extract this specific binary string, it is no longer recognized as a binary string. Instead, it is now an ordinary string (str).
import docx
from simplecrypt import decrypt

p_loc = input("Which platform do you need?")
doc_existing = docx.Document(r"xxx")
text = []
for i in doc_existing.paragraphs:
    text.append(i.text)
for pos, i in enumerate(text):
    if i == p_loc:
        len_pos = len(text[pos+1])
        p_code = text[pos+1][2:len_pos-1]  # get the binary string which is of type ordinary string
        print(p_code.encode("utf8"))  # when I apply .encode, another \ is added so I have for my binary code two \\
        salt_data = "AbCdEfkhl"
        plain = decrypt(salt_data, p_code)
        print(plain)
p_code without .encode statement (as a string, not bytestring!): sc\x00\x02X\xd8\x8ez\xbfB\x03s\xc5\x8bm\xecp\x19\x8d\xd6lqW\xf1\xc3\xa4y\x8f\x1aW)\x9bX\xfc\x0e\xa4\xf2ngJj/]{\x80\x06-\x07\x8cQ\xeef\x0b\x02?\x86\x19\x98\x94eW\x08}\x1d8\xdb\xe57\xf7\x97\x81\xb6\xc7\x08\n^\xc9\xc0
When I now print out p_code.encode("utf8") I get the following result: b'sc\\x00\\x02X\\xd8\\x8ez\\xbfB\\x03s\\xc5\\x8bm\\xecp\\x19\\x8d\\xd6lqW\\xf1\\xc3\\xa4y\\x8f\\x1aW)\\x9bX\\xfc\\x0e\\xa4\\xf2ngJj/]{\\x80\\x06-\\x07\\x8cQ\\xeef\\x0b\\x02?\\x86\\x19\\x98\\x94eW\\x08}\\x1d8\\xdb\\xe57\\xf7\\x97\\x81\\xb6\\xc7\\x08\\n^\\xc9\\xc0'
So the problem is that, compared with the original binary string, this second binary string has a second \ added in front of every escape sequence. As a consequence, I am not able to decrypt it, because it is not recognized as the original binary string.
So my question is: Is there a simple way to transform a string which is already in binary style back into a binary string, so that it is the same as the original? Or is there a way I could remove the second \ so that I have the original binary string again?
I am very grateful for any help!!
OK. So when you do f"{ciphertext}", you are telling Python to store the string representation of those bytes, as text, in the doc.
E.g.
>>> b = b"\x00\x01\x65\x66"
>>> print(f"{b}")
b'\x00\x01ef'
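To make the difference concrete, here is a minimal sketch using only the standard library. It assumes the document holds the complete printed form, including the b'...' wrapper; the variable names are only for illustration. It shows why calling .encode() on that text does not give the original bytes back, and that ast.literal_eval is one possible way to parse such an already-stored literal.

```python
import ast

original = b"\x00\x01\x65\x66"   # 4 raw bytes
as_text = f"{original}"          # the text that ends up in the document

# The text contains the characters "\", "x", "0", "0", ... and not the raw
# bytes, so encoding the sliced text gives 10 bytes instead of 4.
reencoded = as_text[2:-1].encode("utf8")

# Parsing the complete literal (including the b'...' wrapper) restores the bytes.
recovered = ast.literal_eval(as_text)

print(len(original), len(reencoded))  # 4 10
print(recovered == original)          # True
```

This can help with entries that are already in the document, but it depends on the stored text being an intact Python bytes literal.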
You (probably) don't really want to store b'\x00\x01ef' in your Word doc. A good general way to store binary data in text form is to use a different encoding. Base64 is a commonly used encoding that is intended to store binary data in a text-based form.
See https://docs.python.org/3/library/base64.html for more information.
In your case, you could do something like:
import base64
cipher_b64_b = base64.b64encode(ciphertext)
cipher_b64 = cipher_b64_b.decode() # cipher_b64 is now a string.
# Now store this cipher_b64 string in your word document
...
# Now you fetch p_code (which is now a base64 string) from your word doc
cipher_b64_b = p_code.encode()
cipher = base64.b64decode(cipher_b64_b)
This results in your original binary ciphertext. The Word document will contain a base64-encoded string like "AAFlZg==", which avoids the issues with escape sequences etc. in your Word document.
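As a self-contained check of the round trip, here is a short sketch that runs without simple-crypt or python-docx. The helpers to_doc_text and from_doc_text are hypothetical names, and the ciphertext is just a stand-in for whatever encrypt() returns.

```python
import base64


def to_doc_text(ciphertext: bytes) -> str:
    """Encode raw bytes as ASCII-only base64 text (hypothetical helper)."""
    return base64.b64encode(ciphertext).decode("ascii")


def from_doc_text(stored: str) -> bytes:
    """Decode base64 text read from the document back into raw bytes (hypothetical helper)."""
    return base64.b64decode(stored.strip())


ciphertext = b"sc\x00\x02X\xd8\x8ez\xbfB"  # stand-in for the output of encrypt()
stored = to_doc_text(ciphertext)            # this string goes into the document
restored = from_doc_text(stored)            # this is what you pass to decrypt()

assert restored == ciphertext
print(to_doc_text(b"\x00\x01\x65\x66"))     # AAFlZg==
```

Because the stored text consists only of letters, digits, "+", "/" and "=", there is nothing for escape sequences or slicing to get wrong.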