Python HTTPS Proxy Tunnelling


I'm trying to make an HTTP proxy in Python. So far I've got everything except HTTPS working, so the next step is to implement the CONNECT method.

I'm slightly confused about the chain of events that needs to occur when doing HTTPS tunnelling. From my understanding, I should have this when connecting to Google:

Browser -> Proxy

CONNECT www.google.co.uk:443 HTTP/1.1\r\n\r\n

Then the proxy should establish a secure connection to google.co.uk, and confirm it by sending:

Proxy -> Browser

HTTP/1.1 200 Connection established\r\n\r\n
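
A minimal sketch of the first step of this exchange, assuming the request line always has the form `CONNECT host:port HTTP/x.y`. The helper name `parse_connect_target` is hypothetical and not part of the code in this question; it only illustrates how the host and port could be pulled out of the request line before the proxy connects.

```python
def parse_connect_target(request_line):
    """Split 'CONNECT host:port HTTP/1.1' into (host, port).

    Hypothetical helper for illustration; raises ValueError on anything
    that is not a well-formed CONNECT request line.
    """
    parts = request_line.strip().split()
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        raise ValueError("not a CONNECT request line: %r" % request_line)
    # rpartition splits on the last colon, so the port is always the tail
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected host:port, got %r" % parts[1])
    return host, int(port)
```

Returning the port as an `int` matters because `socket.connect()` expects an integer port, not a string.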

At this point I'd expect the browser to go ahead with whatever it was going to do in the first place. However, I either get nothing, or get a string of bytes that I can't decode(). I've been reading anything and everything to do with SSL tunnelling, and I think I'm supposed to be forwarding any and all bytes from the browser to the server, and vice versa. However, when doing this, I get:

HTTP/1.0 400 Bad Request\r\n...\r\n

Once I've sent the 200 code, what should I be doing next?

My code snippet for the connect method:

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    
if headers["Method"] == "CONNECT":
    client = ssl.wrap_socket(client)
    
    try:
        client.connect(( headers["Host"], headers["Port"] ))
        reply = "HTTP/1.0 200 Connection established\r\n"
        reply += "Proxy-agent: Pyx\r\n"
        reply += "\r\n"
        browser.sendall( reply.encode() )
    except socket.error as err:
        print(err)
        break
    
    while True:
        pass  # now not sure what to do here

Solution

  • After finding this answer to a related question: HTTPS Proxy Implementation (SSLStream)

    I realised that the proxy's initial connection to port 443 of the target server (in this case google.co.uk) should NOT be encrypted by the proxy itself. I therefore removed the

    client = ssl.wrap_socket(client)
    

    line, so that the proxy opens a plain TCP tunnel rather than an SSL connection. Once the

    HTTP/1.1 200 Connection established\r\n\r\n
    

    message is sent, the browser and the end server form their own SSL connection through the proxy, so the proxy doesn't need to do anything related to the actual HTTPS connection. It only has to relay the raw bytes in both directions.
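
The undecodable bytes mentioned in the question are most likely the start of the browser's TLS handshake, which is binary and not meant to be decoded as text. As a rough debugging sketch: a TLS record starts with a content-type byte (0x16 for a handshake record) followed by a version whose major byte is 0x03. The function name `looks_like_tls_handshake` is made up for this example, and the proxy does not need such a check in order to work.

```python
def looks_like_tls_handshake(data):
    """Return True if data starts like a TLS handshake record.

    Debugging aid only (hypothetical helper): the proxy should forward
    these bytes untouched rather than try to decode() them.
    """
    # Byte 0: record content type, 0x16 means "handshake"
    # Byte 1: major protocol version, 0x03 for SSL 3.0 and all TLS versions
    return len(data) >= 2 and data[0] == 0x16 and data[1] == 0x03
```

If this returns True for the first chunk read from the browser after the 200 reply, the tunnel is behaving as expected and the bytes should simply be passed on to the server.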

    The modified code (includes byte forwarding):

    # If we receive a CONNECT request
    if headers["Method"] == "CONNECT":
        # Connect to port 443
        try:
            # If successful, send 200 code response
            client.connect(( headers["Host"], headers["Port"] ))
            reply = "HTTP/1.0 200 Connection established\r\n"
            reply += "Proxy-agent: Pyx\r\n"
            reply += "\r\n"
            browser.sendall( reply.encode() )
        except socket.error as err:
            # If the connection could not be established, exit
            # Should properly handle the exit with http error code here
            print(err)
            break
        
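        # Indiscriminately forward bytes in both directions until either side
        # closes the connection (needs "import select" at the top of the module)
        peer = {browser: client, client: browser}
        tunnel_open = True
        while tunnel_open:
            # Block until at least one socket has data, instead of busy-polling
            readable, _, _ = select.select([browser, client], [], [])
            for sock in readable:
                try:
                    data = sock.recv(4096)
                    if data:
                        peer[sock].sendall(data)
                except socket.error:
                    data = b""
                if not data:
                    # Empty read or socket error: the tunnel is finished
                    tunnel_open = False
                    break
        browser.close()
        client.close()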
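
The comment in the code above notes that a failed connection should be answered with an HTTP error code. One way that could look is sketched below. The helper name `open_upstream`, the 10 second timeout and the choice of `502 Bad Gateway` are assumptions made for this example, not part of the original answer.

```python
import socket


def open_upstream(browser, host, port, timeout=10):
    """Connect to (host, port) and tell the browser how it went.

    Returns the connected upstream socket, or None after sending a 502.
    Hypothetical helper: the name, the timeout and the choice of 502
    are assumptions made for this sketch.
    """
    try:
        # Plain TCP only; the browser performs the TLS handshake itself
        upstream = socket.create_connection((host, port), timeout=timeout)
    except OSError as err:
        body = ("Could not connect to %s:%d (%s)\r\n" % (host, port, err)).encode()
        head = (
            "HTTP/1.1 502 Bad Gateway\r\n"
            "Content-Type: text/plain\r\n"
            "Content-Length: %d\r\n"
            "Connection: close\r\n"
            "\r\n" % len(body)
        ).encode()
        browser.sendall(head + body)
        return None
    upstream.settimeout(None)  # back to blocking mode for the relay loop
    browser.sendall(b"HTTP/1.1 200 Connection established\r\n\r\n")
    return upstream
```

The caller would start forwarding bytes only when the return value is not None, and otherwise close the browser socket.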
    

    References:

    HTTPS Proxy Implementation (SSLStream)

    https://datatracker.ietf.org/doc/html/draft-luotonen-ssl-tunneling-03

    http://www.ietf.org/rfc/rfc2817.txt