I use this code to encode and compress text. But it doesn't work properly:
Traceback (most recent call last):
  File "E:\SOUND.py", line 114, in <module>
    unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
TypeError: a bytes-like object is required, not 'str'
Can you help me?
import binascii
import zlib,gzip

def str2hex(s):
    return binascii.hexlify(bytes(str.encode(s)))

def hex2str(h):
    return binascii.unhexlify(h)

hexstring = input()
if len(hexstring) > 200:
    hexstring = str(zlib.compress(hexstring.encode('utf-8')))
    print(hexstring)

hexstring = str2hex(hexstring)
ph = str(hexstring.decode('utf-8'))
print(ph)

#decompressing text
unhexsring = hex2str(hexstring).decode('utf8')
if 'x' in str(unhexsring):
    print('compressed')
    unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
print(unhexsring)
This code will not decompress the zlib-compressed text.
The encoding part works fine.
My problem is that once I compress a string and hex-encode it, I can't decompress it again. This is how it should work:
1>s = input('some text')
2>if len(s) > 200: s = str(zlib.compress(s.encode('utf-8')))
3>encoding it with str2hex()
4>decode it with hex2str()
5>str(zlib.decompress(unhexs).encode('utf8')) <---------- HERE
But I can't decompress it properly. This is what I get (console dump follows):
Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>>
========================= RESTART: E:\SOUND.py =========================
dghlkdushfgkjdsfhglkjhsdfgjhdskfjhgkdsfhgkjdhfgkjsdhfgjkhsdkjfghlkjsdhgkjhsdfjghdksjhgkjsdhgkjhsdfkjghdskfjghkdjghdghlkdushfgkjdsfhglkjhsdfgjhdskfjhgkdsfhgkjdhfgkjsdhfgjkhsdkjfghlkjsdhgkjhsdfjghdksjhgkjsdhgkjhsdfkjghdskfjghkdjghdghlkdushfgkjdsfhglkjhsdfgjhdskfjhgkdsfhgkjdhfgkjsdhfgjkhsdkjfghlkjsdhgkjhsdfjghdksjhgkjsdhgkjhsdfkjghdskfjghkdjghdghlkdushfgkjdsfhglkjhsdfgjhdskfjhgkdsfhgkjdhfgkjsdhfgjkhsdkjfghlkjsdhgkjhsdfjghdksjhgkjsdhgkjhsdfkjghdskfjghkdjgh
b'x\x9c\xed\x8d\xb1\r\xc0@\x08\x03\x97\xb5\xb0e\x7f\x87\xb2\x7f\x9eO\x93\x05\xd2\xa5\x02\x1d>\x0cj\x05W\xab\x18\xa3K\\\xb1\x1aE\x0b\x9d\xb2\x98\x83\xf7\xf5dz\x86\xb3#q\x8d<\x84\x8fc\n\xe9Q^0C\xe7\x13\x15\xcc\xfe7~\xd0x\x03\x88\x05\xbb\x9d'
6227785c7839635c7865645c7838645c7862315c725c786330405c7830385c7830335c7839375c7862355c786230655c7837665c7838375c7862325c7837665c7839654f5c7839335c7830355c7864325c7861355c7830325c7831643e5c7830636a5c783035575c7861625c7831385c7861334b5c5c5c7862315c783161455c7830625c7839645c7862325c7839385c7838335c7866375c786635647a5c7838365c78623323715c7838643c5c7838345c783866635c6e5c786539515e30435c7865375c7831335c7831355c7863635c786665377e5c786430785c7830335c7838385c7830355c7862625c78396427
compressed
Traceback (most recent call last):
File "E:\SOUND.py", line 114, in <module>
unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
TypeError: a bytes-like object is required, not 'str'
The exception you see here:
unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
TypeError: a bytes-like object is required, not 'str'
is raised because zlib.decompress expects bytes. This is easily fixed by changing
unhexsring = hex2str(hexstring).decode('utf8') # -> str
to
unhexsring = hex2str(hexstring) # -> bytes
However this results in a new error:
unhexsring = zlib.decompress(unhexsring)
zlib.error: Error -3 while decompressing data: incorrect header check
This one is happening because of this line:
hexstring = str(zlib.compress(hexstring.encode('utf-8')))
Calling str on a bytes instance doesn't convert the bytes to str, it converts the bytes' repr to str.
>>> bs = 'Hello World'.encode('utf-8')
>>> print(repr(bs))
b'Hello World'
>>> s = str(bs)
>>> print(repr(s))
"b'Hello World'" # <- note the b....
The str conversion wraps the compressed data in b'...' (and spells out non-printable bytes as escape sequences), so the data no longer starts with a valid zlib header. Let's leave hexstring as a bytes object for now:
hexstring = zlib.compress(hexstring.encode('utf-8'))
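To see the corruption concretely, here is a minimal sketch. The sample text and the variable names are made up for illustration; only zlib.compress and zlib.decompress come from the code above.

```python
import zlib

original = ('some sample text ' * 20).encode('utf-8')  # illustrative input
compressed = zlib.compress(original)

# str() returns the repr of the bytes object: it starts with the letter b,
# followed by a quote and the data written out as escape sequences.
as_repr = str(compressed)

# Encoding that repr back to bytes does not recover the compressed data:
# the first byte is now ord('b'), which is not a valid zlib header.
corrupted = as_repr.encode('utf-8')
try:
    zlib.decompress(corrupted)
    failed = False
except zlib.error:
    failed = True

# The untouched bytes object round-trips without problems.
roundtrip_ok = zlib.decompress(compressed) == original
```

Here failed ends up True for the repr-derived data, while roundtrip_ok is True for the original bytes object.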
But now the code raises yet another exception:
return binascii.hexlify(bytes(str.encode(s)))
TypeError: descriptor 'encode' requires a 'str' object but received a 'bytes'
s is now a bytes object, so there's no need to try to convert it (and note that str.encode returns bytes anyway, so the bytes call would be redundant even if s were a string).
So str2hex becomes
def str2hex(s):
    return binascii.hexlify(s)
Now yet another error is raised:
unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
AttributeError: 'bytes' object has no attribute 'encode'
The output of zlib.decompress is a bytes object, so it's already encoded (assuming it was a string to begin with). You want to decode it to get a str:
unhexsring = zlib.decompress(unhexsring).decode('utf8')
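As a quick reminder of which direction each method goes, here is a small sketch (the sample text is arbitrary):

```python
text = 'compress me'             # str: a sequence of Unicode characters
data = text.encode('utf-8')      # str -> bytes
restored = data.decode('utf-8')  # bytes -> str

# bytes objects have .decode() but no .encode(), which is why calling
# .encode() on the result of zlib.decompress raises AttributeError.
has_encode = hasattr(data, 'encode')
```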
This is a version of your code that can be run as a script from the command prompt:
import binascii
import random
import string
import zlib


def str2hex(s):
    return binascii.hexlify(s)


def hex2str(h):
    return binascii.unhexlify(h)


def main():
    # I don't want to type 200+ chars to test this :-)
    hexstring = ''.join(random.choices(string.ascii_letters, k=201))
    hexstring = hexstring.encode('utf-8')
    if len(hexstring) > 200:
        hexstring = zlib.compress(hexstring)
        print(f'{hexstring=}')  # note: the f'{name=}' form needs Python 3.8+

    hexstring = str2hex(hexstring)
    decoded_hexstring = hexstring.decode('utf-8')
    print(f'{decoded_hexstring=}')

    # decompressing text
    unhexstring = hex2str(hexstring)
    # Checking for 'x' in the string isn't a good way to check for
    # compression. For small data we can just try to see if it can be
    # decompressed. For large data we could inspect the first byte - see
    # https://stackoverflow.com/q/9050260/5320906
    try:
        unhexstring = zlib.decompress(unhexstring)
        print('compressed')
    except zlib.error:
        # Not compressed, do nothing.
        pass
    print(f'{unhexstring=}')


if __name__ == '__main__':
    main()
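If you would rather inspect the header than rely on try/except alone, something like the following might work. This is only a sketch: the function names looks_like_zlib, encode_text and decode_text and the threshold parameter are my own, not part of your code or of any library. The check relies on the zlib format (RFC 1950): the low four bits of the first byte are 8 for deflate, and the first two bytes read as a big-endian number are a multiple of 31.

```python
import binascii
import zlib


def looks_like_zlib(data: bytes) -> bool:
    """Heuristic: does data start with a plausible zlib (RFC 1950) header?"""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    # Low nibble of CMF must be 8 (deflate), and the two header bytes taken
    # as a big-endian integer must be divisible by 31.
    return (cmf & 0x0F) == 8 and (cmf * 256 + flg) % 31 == 0


def encode_text(text: str, threshold: int = 200) -> bytes:
    """Hex-encode text, compressing it first if it exceeds threshold bytes."""
    data = text.encode('utf-8')
    if len(data) > threshold:
        data = zlib.compress(data)
    return binascii.hexlify(data)


def decode_text(hexdata: bytes) -> str:
    """Reverse encode_text: unhexlify, decompress if needed, decode to str."""
    data = binascii.unhexlify(hexdata)
    if looks_like_zlib(data):
        try:
            data = zlib.decompress(data)
        except zlib.error:
            # The header matched by coincidence; treat as uncompressed.
            pass
    return data.decode('utf-8')
```

The header test is only a heuristic: plain text that begins with "x " (an x followed by a space) happens to pass it as well, which is why decode_text still falls back to try/except. A more robust design would store an explicit flag alongside the data saying whether it was compressed.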