Tags: python, sockets, proxy, http-proxy, burp

Weird response body in a Python proxy server using the `socket` module


The Problem

I'm working on a basic proxy server with Python 3 sockets. It works, but not as it should: the response headers are fine, but the body is not. As shown below, the response body looks weird (just bytes like '\x1f\x8b\x08\x00\x00...'), yet when I forward it to the browser, the browser renders it correctly.

The response received with socket

b"HTTP/1.1 200 OK
Vary: Accept-Encoding\r\n
Content-Encoding: gzip\r\n
Content-Length: 156\r\n
Keep-Alive: timeout=5, max=100\r\n
Connection: Keep-Alive\r\n
Content-Type: text/html\r\n
\r\n
\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03-\x8e\xcb\x0e\x83 \x14D\xf7|\x05\xb2.\xd5e\xa3\xe8\xda?p\x8d@\x81\xf4\xea5p\xfb\xf0\xef\x8b\xc6\xd5$'\x939\x03h4\x04\xcc\xa4\x02-00\x15\x9c\xb6%(\x12\xb8\x81\x8d\x0e\x00o|\xc2\x04\xb6b\xaa\xbe\xb0\xaa\xaf\xda\x8cv\xe7\xb37\x08\x98z1\x836/Q\xb0\x8d\x1f\x9ei\x07\xd7\x8bE'\x1f\xd7V\xbf\t\xbbo\xb4\x14\xdaG\xd3l\xbf\xae\xd4\xc6\xc8T%\xa5\x8a\x8b\xe79\x99^\x84\xe7}[\xbd\x18\xa4<\x14e\xe4\x88Cq\x1a\xcf\x7f\x7f\x10\x07P@\xb0\x00\x00\x00"

The code that receives the response is:

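import socket

def receive(sock):
    """Read from sock until the peer closes it or 3 seconds pass with no data."""
    sock.settimeout(3)
    data = b""
    try:
        while True:
            rcvd = sock.recv(4096)
            if not rcvd:  # empty read: the peer closed the connection
                break
            data += rcvd
    except socket.timeout:
        # Keep-alive connections stay open, so the timeout ends the read.
        pass
    return data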

Then I used Burp Suite to get the same response, and the response body was normal.

The same response with Burp

HTTP/1.1 200 OK
Vary: Accept-Encoding
Content-Length: 176
Connection: close
Content-Type: text/html

localhost<html>
<head>
<title>
Hello, World!
</title>
</head>
<body bgcolor="black">
<div style="margin:auto;width:800px;">
Hi
</div>
</body>
</html>

This problem happens with localhost pages that exist. If the requested page does not exist (404 Not Found), the response is normal and has a clear, readable body.

So I want to figure out what the issue is and how to fix it.


Solution

  • It works, but not as it should: the response headers are fine, but the body is not. As shown below, the response body looks weird (just bytes like '\x1f\x8b\x08\x00\x00...'),

    The body is perfectly fine; it just does not look the way you expect:

    b"HTTP/1.1 200 OK
    ...
    Content-Encoding: gzip\r\n
    ...
    \x1f\x8b\x08\x00\x00\x0...
    

    As the response header shows, the content is compressed with gzip. To see the "real" body you expect, you need to decompress it with gzip. If you don't want this, don't send the Accept-Encoding: gzip ... or a similar field in your request header, since it explicitly indicates that you are willing to accept gzip-compressed content.

    In general: HTTP is more complex than you might think from looking at a few examples. Apart from compression, there are other things that usually come as a surprise, such as chunked transfer encoding and multiple requests and responses within the same TCP connection. Please study the HTTP standards for more information instead of just "assuming"; that's what these standards are for.
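
As a rough illustration of the two points above, the sketch below takes a complete raw response (for example, the bytes returned by the question's receive function), undoes chunked transfer encoding if the headers announce it, and then decompresses a gzip body. The helper names split_response, dechunk and decode_body are made up for this example, and it assumes the whole response is already in memory, ignores chunk extensions and trailers, and keeps only the last value of a repeated header.

```python
import gzip


def split_response(raw: bytes):
    """Split a raw HTTP response into (status line, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lower-cased.
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


def dechunk(body: bytes) -> bytes:
    """Undo chunked transfer encoding (chunk extensions and trailers ignored)."""
    out = b""
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # The chunk size is a hexadecimal number, optionally followed by ";ext".
        size = int(body[pos:eol].split(b";")[0], 16)
        if size == 0:  # a zero-sized chunk marks the end of the body
            break
        start = eol + 2
        out += body[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF
    return out


def decode_body(raw: bytes) -> bytes:
    """Return the body with transfer and content encodings removed."""
    _, headers, body = split_response(raw)
    if "chunked" in headers.get("transfer-encoding", "").lower():
        body = dechunk(body)
    if headers.get("content-encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    return body
```

A proxy that only forwards traffic does not need any of this, since it can pass the bytes through unchanged, which is why the browser rendered the page correctly. Decoding is only needed when you want to inspect or modify the body yourself, as Burp does before displaying it.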