Close connection with python tornado http client on each call


I am using Tornado (Python) to make hundreds of non-blocking calls to an external proxy service. The service requires a new connection for every call.

I wrote the following:

from tornado import httpclient, ioloop

config = {
    'proxy_host': 'host',
    'proxy_port': port,
}
io_counter = 0

def handle_request(response):
    # Runs once per completed fetch; stops the loop after the last one.
    global io_counter
    io_counter += 1
    do_something()
    if io_counter == max_calls:
        ioloop.IOLoop.instance().stop()

httpclient.AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")
for url in urls:
    client = httpclient.AsyncHTTPClient(force_instance=True)
    client.fetch(url, callback=handle_request, headers=headers, **config)
ioloop.IOLoop.instance().start()

However, the service claims that I am using the same connection.

How can I use a different connection on each call?


Solution

  • You want to prevent libcurl from reusing connections. The quick answer is CURLOPT_FORBID_REUSE, which pycurl exposes as pycurl.FORBID_REUSE.

    Reproduction

    In a virtualenv, run pip install proxy.py tornado (pycurl must also be installed for the curl-based client). Open two terminal windows and activate the virtualenv in each.

    test.py

    import pycurl
    from tornado import gen, ioloop, httpclient
    
    urls = ['https://httpbin.org/uuid', 'https://httpbin.org/uuid']
    proxy_config = {'proxy_host': '127.0.0.1', 'proxy_port': 8899}
    
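    def prepare(curl):
        # Uncomment the next line to make libcurl close the connection
        # after each request instead of keeping it for reuse.
        # curl.setopt(pycurl.FORBID_REUSE, 1)
        pass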
    
    @gen.coroutine
    def main():
        client = httpclient.AsyncHTTPClient()
        for url in urls:
            kwargs = {}
            kwargs.update(proxy_config)
            kwargs['prepare_curl_callback'] = prepare
            response = yield client.fetch(url, **kwargs)
            print(response.body)
    
    if __name__ == '__main__':
        httpclient.AsyncHTTPClient.configure(
            'tornado.curl_httpclient.CurlAsyncHTTPClient')
        ioloop.IOLoop.instance().run_sync(main)
    

    window #1

    $ python test.py
    

    window #2

    $ proxy.py --log-level DEBUG 2>&1 | grep -P "(Closing proxy|Proxying connection)"
    

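    Window #2 should output something like the following. Both requests went through a single proxied connection, so only one open/close pair is logged: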

    2017-10-20 19:03:53,091 - DEBUG - pid:30914 - Proxying connection <socket._socketobject object at 0x7fc90f0fe0c0> at address ('127.0.0.1', 55202)
    2017-10-20 19:03:53,695 - DEBUG - pid:30914 - Closing proxy for connection <socket._socketobject object at 0x7fc90f0fe0c0> at address ('127.0.0.1', 55202)
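
To check the count without reading the log by eye, the "Proxying connection" lines can be tallied by client address. This is only a rough sketch: the function name count_connections and the regular expression are assumptions based on the log format shown here, and it treats each distinct (host, port) pair as one connection, which would undercount if the OS handed out the same ephemeral port twice.

```python
import re

# Matches the client address at the end of a "Proxying connection" log line.
OPENED = re.compile(r"Proxying connection .* at address \('([\d.]+)', (\d+)\)")


def count_connections(log_lines):
    """Return the number of distinct client (host, port) pairs the proxy accepted."""
    seen = set()
    for line in log_lines:
        match = OPENED.search(line)
        if match:
            seen.add((match.group(1), int(match.group(2))))
    return len(seen)


sample = [
    "2017-10-20 19:03:53,091 - DEBUG - pid:30914 - Proxying connection "
    "<socket._socketobject object at 0x7fc90f0fe0c0> at address ('127.0.0.1', 55202)",
    "2017-10-20 19:03:53,695 - DEBUG - pid:30914 - Closing proxy for connection "
    "<socket._socketobject object at 0x7fc90f0fe0c0> at address ('127.0.0.1', 55202)",
]
print(count_connections(sample))  # 1
```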
    

    Solution

    Now uncomment curl.setopt(pycurl.FORBID_REUSE, 1) and you will see one connection opened and closed per request:

    2017-10-20 19:05:19,492 - DEBUG - pid:30931 - Proxying connection <socket._socketobject object at 0x7f66de66e0c0> at address ('127.0.0.1', 55214)
    2017-10-20 19:05:19,890 - DEBUG - pid:30931 - Closing proxy for connection <socket._socketobject object at 0x7f66de66e0c0> at address ('127.0.0.1', 55214)
    2017-10-20 19:05:19,893 - DEBUG - pid:30932 - Proxying connection <socket._socketobject object at 0x7f66de66e280> at address ('127.0.0.1', 55218)
    2017-10-20 19:05:20,279 - DEBUG - pid:30932 - Closing proxy for connection <socket._socketobject object at 0x7f66de66e280> at address ('127.0.0.1', 55218)
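
The reproduction issues its requests one after another, whereas the question fires them all at once. A minimal sketch of the same option applied to concurrent fetches might look like the following. The helper names forbid_reuse, build_fetch_kwargs and fetch_all are hypothetical, and the tornado and pycurl imports are deferred into the functions only so that the helpers can be loaded and inspected without those packages installed.

```python
def forbid_reuse(curl):
    """prepare_curl_callback: ask libcurl to close the connection after use."""
    import pycurl  # deferred import, see the note above
    curl.setopt(pycurl.FORBID_REUSE, 1)


def build_fetch_kwargs(proxy_host, proxy_port, headers=None):
    """Build the keyword arguments shared by every client.fetch() call."""
    kwargs = {
        'proxy_host': proxy_host,
        'proxy_port': proxy_port,
        'prepare_curl_callback': forbid_reuse,
    }
    if headers:
        kwargs['headers'] = headers
    return kwargs


def fetch_all(urls, **fetch_kwargs):
    """Fetch all URLs concurrently and return the responses in input order."""
    from tornado import gen, httpclient, ioloop  # deferred import

    httpclient.AsyncHTTPClient.configure(
        'tornado.curl_httpclient.CurlAsyncHTTPClient')

    @gen.coroutine
    def run():
        client = httpclient.AsyncHTTPClient()
        # Yielding a list of futures waits for all of them in parallel.
        responses = yield [client.fetch(url, **fetch_kwargs) for url in urls]
        raise gen.Return(responses)

    return ioloop.IOLoop.current().run_sync(run)
```

With proxy.py running locally, fetch_all(urls, **build_fetch_kwargs('127.0.0.1', 8899)) should produce one "Proxying connection" line per URL in the proxy log. Because the option is set per request through prepare_curl_callback, a single shared client is enough and force_instance=True is not needed for this purpose.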