zlib TypeError: a bytes-like object is required, not 'str'


I use this code to encode and compress text, but it doesn't work properly:

Traceback (most recent call last):
  File "E:\SOUND.py", line 114, in <module>
    unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
TypeError: a bytes-like object is required, not 'str'

Can you help me?

import binascii
import zlib,gzip

def str2hex(s):
    return binascii.hexlify(bytes(str.encode(s)))


def hex2str(h):
    return binascii.unhexlify(h)

hexstring = input()
if len(hexstring) > 200:
    hexstring = str(zlib.compress(hexstring.encode('utf-8')))
    print(hexstring)
hexstring = str2hex(hexstring)
ph = str(hexstring.decode('utf-8'))
print(ph)

#decompressing text
unhexsring = hex2str(hexstring).decode('utf8')
if 'x' in str(unhexsring):
    print('compressed')
    unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
print(unhexsring)

This code will not decompress the zlib-compressed text.

The encoding step works fine.

My trouble is that once I have compressed and hex-encoded the string, I can't decompress it. This is how it should work:

1>s = input('some text')
2>if len(s) > 200: s = str(zlib.compress(s.encode('utf-8'))) 
3>encoding it with str2hex()
4>decode it with hex2str()
5>str(zlib.decompress(unhexs).encode('utf8'))  <---------- HERE

And I can't decompress it properly because I get this:

CONSOLE DUMP NEXT

Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> 
========================= RESTART: E:\SOUND.py =========================
dghlkdushfgkjdsfhglkjhsdfgjhdskfjhgkdsfhgkjdhfgkjsdhfgjkhsdkjfghlkjsdhgkjhsdfjghdksjhgkjsdhgkjhsdfkjghdskfjghkdjghdghlkdushfgkjdsfhglkjhsdfgjhdskfjhgkdsfhgkjdhfgkjsdhfgjkhsdkjfghlkjsdhgkjhsdfjghdksjhgkjsdhgkjhsdfkjghdskfjghkdjghdghlkdushfgkjdsfhglkjhsdfgjhdskfjhgkdsfhgkjdhfgkjsdhfgjkhsdkjfghlkjsdhgkjhsdfjghdksjhgkjsdhgkjhsdfkjghdskfjghkdjghdghlkdushfgkjdsfhglkjhsdfgjhdskfjhgkdsfhgkjdhfgkjsdhfgjkhsdkjfghlkjsdhgkjhsdfjghdksjhgkjsdhgkjhsdfkjghdskfjghkdjgh
b'x\x9c\xed\x8d\xb1\r\xc0@\x08\x03\x97\xb5\xb0e\x7f\x87\xb2\x7f\x9eO\x93\x05\xd2\xa5\x02\x1d>\x0cj\x05W\xab\x18\xa3K\\\xb1\x1aE\x0b\x9d\xb2\x98\x83\xf7\xf5dz\x86\xb3#q\x8d<\x84\x8fc\n\xe9Q^0C\xe7\x13\x15\xcc\xfe7~\xd0x\x03\x88\x05\xbb\x9d'
6227785c7839635c7865645c7838645c7862315c725c786330405c7830385c7830335c7839375c7862355c786230655c7837665c7838375c7862325c7837665c7839654f5c7839335c7830355c7864325c7861355c7830325c7831643e5c7830636a5c783035575c7861625c7831385c7861334b5c5c5c7862315c783161455c7830625c7839645c7862325c7839385c7838335c7866375c786635647a5c7838365c78623323715c7838643c5c7838345c783866635c6e5c786539515e30435c7865375c7831335c7831355c7863635c786665377e5c786430785c7830335c7838385c7830355c7862625c78396427
compressed
Traceback (most recent call last):
  File "E:\SOUND.py", line 114, in <module>
    unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
TypeError: a bytes-like object is required, not 'str'

Solution

  • The exception you see here:

    unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
        TypeError: a bytes-like object is required, not 'str'
    

    is raised because zlib.decompress expects bytes. This is easily fixed by changing

    unhexsring = hex2str(hexstring).decode('utf8')    # -> str
    

    to

    unhexsring = hex2str(hexstring)    # -> bytes
    

    However, this results in a new error:

    unhexsring = zlib.decompress(unhexsring)
        zlib.error: Error -3 while decompressing data: incorrect header check
    

    This one is happening because of this line:

    hexstring = str(zlib.compress(hexstring.encode('utf-8')))
    

    Calling str on a bytes instance doesn't convert the bytes to str, it converts the bytes' repr to str.

    >>> bs = 'Hello World'.encode('utf-8')
    >>> print(repr(bs))
    b'Hello World'
    >>> s = str(bs)
    >>> print(repr(s))
    "b'Hello World'"    # <- note the b....
    

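    The str conversion replaces the compressed data with the text of its repr: a leading b and quote, escape sequences such as \x9c spelled out as literal characters, and a closing quote. The data therefore no longer starts with a valid zlib header. Let's leave hexstring as a bytes object for now: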

    hexstring = zlib.compress(hexstring.encode('utf-8'))
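
To make the corruption concrete, here is a minimal sketch (the sample text and variable names are illustrative, not taken from your script). It compresses some text, passes the result through str() the way the original line does, and shows that the resulting characters can no longer be decompressed, while the untouched bytes round-trip cleanly:

```python
import zlib

text = "hello world " * 20
compressed = zlib.compress(text.encode("utf-8"))  # bytes

# str() on bytes returns the printable repr (b'x\x9c...' spelled out as
# ordinary characters), not the raw compressed data.
as_repr = str(compressed)

# Encoding that repr back to bytes does not restore the original data:
# the first byte is now the letter "b", so the zlib header is invalid.
corrupted = as_repr.encode("utf-8")
try:
    zlib.decompress(corrupted)
    repr_round_trip_ok = True
except zlib.error:
    repr_round_trip_ok = False

# Keeping the data as bytes round-trips without trouble.
restored = zlib.decompress(compressed).decode("utf-8")

print(repr_round_trip_ok)
print(restored == text)
```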
    

    But now the code raises yet another exception:

    return binascii.hexlify(bytes(str.encode(s)))
        TypeError: descriptor 'encode' requires a 'str' object but received a 'bytes'
    

    s is now a bytes object, so there's no need to try to convert it (and note that str.encode returns bytes anyway, so the bytes call would be redundant even if s were a string).

    So str2hex becomes

    def str2hex(s):
        return binascii.hexlify(s)
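
As a quick check of the hex step on its own, the sketch below (with made-up sample bytes) shows that binascii.hexlify takes bytes and returns bytes, and that binascii.unhexlify reverses it. The built-in bytes.hex() and bytes.fromhex() methods are shown as an alternative you could use instead of the two helper functions; that substitution is a suggestion, not something your code requires:

```python
import binascii

raw = b"x\x9c\x00\xff"  # arbitrary sample bytes, e.g. the start of compressed data

hex_bytes = binascii.hexlify(raw)       # bytes in, bytes out
hex_text = hex_bytes.decode("ascii")    # printable str, safe to store or send
back = binascii.unhexlify(hex_text)     # accepts ASCII str or bytes

# Equivalent built-in methods on bytes objects.
hex_text_builtin = raw.hex()
back_builtin = bytes.fromhex(hex_text_builtin)

print(hex_text)
print(back == raw)
```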
    

    Now yet another error is raised:

    unhexsring = str(zlib.decompress(unhexsring).encode('utf8'))
        AttributeError: 'bytes' object has no attribute 'encode'
    

    The output of zlib.decompress is a bytes object, so it's already encoded (assuming it was a string to begin with). You want to decode it to get a str:

    unhexsring = zlib.decompress(unhexsring).decode('utf8')
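
A small sketch of the direction of each conversion may help here: encode goes from str to bytes, decode goes from bytes to str, and zlib only ever deals in bytes. The sample text is invented and deliberately contains non-ASCII characters to show that the UTF-8 decode restores them:

```python
import zlib

original = "naïve café " * 30                      # str

packed = zlib.compress(original.encode("utf-8"))    # str -> bytes -> compressed bytes
unpacked = zlib.decompress(packed)                  # compressed bytes -> bytes
text = unpacked.decode("utf-8")                     # bytes -> str

print(type(unpacked).__name__)
print(text == original)
```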
    

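    This is a version of your code that can be run as a script from the command prompt. Note that the f'{name=}' syntax it uses for printing requires Python 3.8 or later; on the Python 3.7 shown in your console dump, write f'hexstring={hexstring!r}' and so on instead: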

    import binascii
    import random
    import string
    import zlib
    
    
    def str2hex(s):
        return binascii.hexlify(s)
    
    
    def hex2str(h):
        return binascii.unhexlify(h)
    
    
    def main():
        # I don't want to type 200+ chars to test this :-)
        hexstring = ''.join(random.choices(string.ascii_letters, k=201))
        hexstring = hexstring.encode('utf-8')
        if len(hexstring) > 200:
            hexstring = zlib.compress(hexstring)
        print(f'{hexstring=}')
        hexstring = str2hex(hexstring)
        decoded_hexstring = hexstring.decode('utf-8')
        print(f'{decoded_hexstring=}')
    
        # decompressing text
        unhexstring = hex2str(hexstring)
        # Checking for 'x' in the string isn't a good way to check for
        # compression. For small data we can just try to see if it can be
        # decompressed. For large data we could inspect the first byte - see
        # https://stackoverflow.com/q/9050260/5320906
        try:
            unhexstring = zlib.decompress(unhexstring)
            print('compressed')
        except zlib.error:
            # Not compressed, do nothing.
            pass
        print(f'{unhexstring=}')
    
    
    if __name__ == '__main__':
        main()
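
If you want to inspect the header rather than rely only on try/except, the following is one possible sketch. The function names pack, unpack and looks_like_zlib and the THRESHOLD constant are hypothetical, introduced here for illustration. The check follows the two-byte zlib header layout from RFC 1950 (the low four bits of the first byte are 8 for deflate, and the two bytes read as a big-endian number are a multiple of 31). It is only a heuristic, since short plain text can match the header by coincidence, so the sketch still falls back to treating the data as uncompressed when decompression fails:

```python
import binascii
import zlib

THRESHOLD = 200  # same cut-off as in the question


def looks_like_zlib(data: bytes) -> bool:
    """Heuristic test of the two-byte zlib header (RFC 1950)."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    # Compression method 8 is deflate; the header doubles as a checksum
    # because (cmf * 256 + flg) must be divisible by 31.
    return (cmf & 0x0F) == 8 and ((cmf << 8) | flg) % 31 == 0


def pack(text: str) -> str:
    """Encode text as hex, compressing it first if it is long."""
    data = text.encode("utf-8")
    if len(data) > THRESHOLD:
        data = zlib.compress(data)
    return binascii.hexlify(data).decode("ascii")


def unpack(hex_text: str) -> str:
    """Reverse pack(): unhex, decompress if needed, decode to str."""
    data = binascii.unhexlify(hex_text)
    if looks_like_zlib(data):
        try:
            data = zlib.decompress(data)
        except zlib.error:
            # The header matched by coincidence; keep the data as it is.
            pass
    return data.decode("utf-8")


if __name__ == "__main__":
    long_text = "some long text " * 20
    print(unpack(pack(long_text)) == long_text)
    print(unpack(pack("short")))
```

Keeping everything as bytes between pack and unpack, and converting to str only at the two ends, avoids all four errors discussed above.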