
Sending gzip compressed data through TCP socket in Python


I'm creating an HTTP server in Python without any of the HTTP libraries for learning purposes. Right now it can serve static files fine.

The way I serve the file is through this piece of code:

with open(self.filename, 'rb') as f:
    src = f.read()
socket.sendall(src)

However, I want to improve performance by sending compressed data instead of uncompressed data. I know that my browser (Chrome) accepts compressed data because it sends this request header:

Accept-Encoding: gzip, deflate, sdch

So I changed my code to this:

with open(self.filename, 'rb') as f:
    src = zlib.compress(f.read())
socket.sendall(src)

But this just outputs garbage. What am I doing wrong?


Solution

  • The zlib library implements the deflate compression algorithm (RFC 1951). There are two encapsulations for the deflate compression: zlib (RFC 1950) and gzip (RFC 1952). These differ only in the kind of header and trailer they provide around the deflate compressed data.

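    zlib.compress produces the zlib encapsulation (RFC 1950): a two-byte zlib header and an Adler-32 trailer around the deflate data. That is not the gzip format, so a client expecting gzip cannot decode it. To get a gzip header and trailer you need to use a compression object. For gzip this looks like this: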

    import zlib
    # -1 selects the default compression level (zlib.Z_DEFAULT_COMPRESSION)
    z = zlib.compressobj(-1, zlib.DEFLATED, 31)
    gzip_compressed_data = z.compress(data) + z.flush()
    

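    The important part here is the 31 passed as the third argument (wbits) to compressobj. It is 16 + 15: the 16 selects the gzip header and trailer, and the 15 is the window size (2^15 bytes). The output is then in gzip format and can be sent with a Content-Encoding: gzip response header, which the server must include so the browser knows to decompress the body.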
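As a rough sketch of how this could fit into a hand-written server like the one in the question, the snippet below compresses a body and prepends a minimal set of response headers. The names `gzip_bytes` and `build_response` are hypothetical helpers, not part of the original code, and a real response would normally also carry a Content-Type header and should check the request's Accept-Encoding before compressing.

```python
import gzip
import zlib


def gzip_bytes(data: bytes) -> bytes:
    """Compress data into gzip format (RFC 1952) using zlib."""
    # wbits=31 is 16 (gzip wrapper) + 15 (32 KiB window).
    z = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, 31)
    return z.compress(data) + z.flush()


def build_response(body: bytes) -> bytes:
    """Hypothetical helper: a minimal HTTP/1.1 200 response with a gzip body."""
    compressed = gzip_bytes(body)
    headers = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Encoding: gzip\r\n"
        # Content-Length must describe the compressed body, not the original.
        f"Content-Length: {len(compressed)}\r\n"
        "\r\n"
    )
    return headers.encode("ascii") + compressed


if __name__ == "__main__":
    original = b"<h1>hello</h1>" * 100
    response = build_response(original)
    head, _, payload = response.partition(b"\r\n\r\n")
    print(head.decode("ascii"))
    # gzip streams start with the magic bytes 0x1f 0x8b.
    print(payload[:2] == b"\x1f\x8b")
    # The standard gzip module can decode what zlib produced with wbits=31.
    print(gzip.decompress(payload) == original)
```

In the question's code, the bytes returned by `build_response` would be what gets passed to `sendall`, in place of the bare compressed data.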