Python - Twisted, Proxy and modifying content


I've looked at a few resources on writing an HTTP proxy using Python and the Twisted framework.

Essentially, like some other questions, I'd like to be able to modify the data that is sent back to the browser. That is, the browser requests a resource and the proxy fetches it. Before the resource is returned to the browser, I'd like to be able to modify any part of the response (both HTTP headers and content).

This (Need help writing a twisted proxy) was what I initially found. I tried it out, but it didn't work for me. I also found this (Python Twisted proxy - how to intercept packets), which I thought would work; however, with it I can only see the HTTP requests from the browser.

I am looking for any advice. One thought is to subclass the ProxyClient and ProxyRequest classes and override their methods, but I read that the Proxy class itself is a combination of both.

For those who may ask to see some code, it should be noted that I have worked with only the above two examples. Any help is great.

Thanks.


Solution

  • To create a ProxyFactory that can modify the server's response headers and content, you could override the ProxyClient.handle*() methods:

    from twisted.python import log
    from twisted.web import http, proxy
    
    class ProxyClient(proxy.ProxyClient):
        """Mangle returned header, content here.
    
        Use `self.father` methods to modify request directly.
        """
        def handleHeader(self, key, value):
            # change response header here
            log.msg("Header: %s: %s" % (key, value))
            proxy.ProxyClient.handleHeader(self, key, value)
    
        def handleResponsePart(self, buffer):
            # change response part here
            log.msg("Content: %s" % (buffer[:50],))
            # make all content upper case
            proxy.ProxyClient.handleResponsePart(self, buffer.upper())
    
    class ProxyClientFactory(proxy.ProxyClientFactory):
        protocol = ProxyClient
    
    class ProxyRequest(proxy.ProxyRequest):
        # On Python 3, Twisted looks up the URL scheme as bytes, so use a bytes key
        protocols = {b"http": ProxyClientFactory}
    
    class Proxy(proxy.Proxy):
        requestFactory = ProxyRequest
    
    class ProxyFactory(http.HTTPFactory):
        protocol = Proxy
    

    I arrived at this solution by reading the source of twisted.web.proxy. I don't know how idiomatic it is.
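
One caveat with the handleResponsePart() override above: it is called once per received chunk, so a per-chunk change can miss text that is split across two chunks, and any change that alters the body length leaves the upstream Content-Length header wrong. A minimal sketch of the buffering idea follows, kept free of Twisted so it can be run on its own. The names BodyBuffer and replace_word are hypothetical and not part of twisted.web.proxy.

```python
class BodyBuffer:
    """Collect response chunks so the whole body can be rewritten at once."""

    def __init__(self):
        self._chunks = []

    def add(self, chunk):
        # Intended to be called once per chunk (e.g. from handleResponsePart)
        # instead of forwarding the chunk to the browser immediately.
        self._chunks.append(chunk)

    def finish(self, transform):
        # Join all chunks, apply the transform to the complete body, and
        # return the new body together with its length in bytes.
        body = transform(b"".join(self._chunks))
        self._chunks = []
        return body, len(body)


def replace_word(body):
    # Hypothetical transform: a replacement that changes the body length.
    return body.replace(b"Example", b"Sample")


if __name__ == "__main__":
    buf = BodyBuffer()
    buf.add(b"<h1>Exa")            # the word is split across two chunks
    buf.add(b"mple Domain</h1>")
    print(buf.finish(replace_word))
```

In a ProxyClient subclass, the assumption is that handleResponsePart() would call add() rather than the base method, and handleResponseEnd() would call finish(), set a corrected Content-Length on self.father, write the new body with self.father.write(), and then call the base handleResponseEnd(). That wiring is not tested here, and compressed responses (Content-Encoding: gzip) would need decoding first.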

    To run it as a script or via twistd, add this at the end of the module:

    portstr = "tcp:8080:interface=localhost" # serve on localhost:8080
    
    if __name__ == '__main__': # $ python proxy_modify_request.py
        import sys
        from twisted.internet import endpoints, reactor
    
        def shutdown(reason, reactor, stopping=[]):
            """Stop the reactor."""
            if stopping: return
            stopping.append(True)
            if reason:
                log.msg(reason.value)
            reactor.callWhenRunning(reactor.stop)
    
        log.startLogging(sys.stdout)
        endpoint = endpoints.serverFromString(reactor, portstr)
        d = endpoint.listen(ProxyFactory())
        d.addErrback(shutdown, reactor)
        reactor.run()
    else: # $ twistd -ny proxy_modify_request.py
        from twisted.application import service, strports
    
        application = service.Application("proxy_modify_request")
        strports.service(portstr, ProxyFactory()).setServiceParent(application)
    

    Usage

    $ twistd -ny proxy_modify_request.py
    

    In another terminal:

    $ curl -x localhost:8080 http://example.com
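
The same check as the curl command can be made from Python with the standard library. This is a small sketch that assumes the proxy is listening on localhost:8080, as in portstr above. The helper names build_proxy_opener and fetch_via_proxy are made up for illustration.

```python
import urllib.request

# Assumed to match portstr ("tcp:8080:interface=localhost") in the proxy script.
PROXY_URL = "http://localhost:8080"


def build_proxy_opener(proxy_url=PROXY_URL):
    """Return an opener that sends plain-HTTP requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, proxy_url=PROXY_URL):
    """Fetch url through the proxy and return (headers, body)."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=10) as response:
        return dict(response.headers.items()), response.read()
```

With the proxy running, fetch_via_proxy("http://example.com") should return the page body as modified by handleResponsePart() (upper-cased in the example above), while the proxy log shows the "Header:" and "Content:" lines. Only the http scheme is routed through the proxy here, since the proxy above only registers an HTTP client factory.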