
Log all requests from the python-requests module


I am using Python Requests. I need to debug some OAuth activity, and for that I would like it to log all requests being performed. I could get this information with ngrep, but unfortunately it is not possible to grep HTTPS connections (which are needed for OAuth).

How can I activate logging of all URLs (+ parameters) that Requests is accessing?


Solution

  • The underlying urllib3 library logs all new connections and URLs with the logging module, but not POST bodies. For GET requests this should be enough:

    import logging
    
    logging.basicConfig(level=logging.DEBUG)
    

    which gives you the most verbose logging option; see the logging HOWTO for more details on how to configure logging levels and destinations.
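If DEBUG output from every library is too noisy, one option is to raise the level only for the urllib3 logger and leave the root logger at its default. This is a minimal sketch, assuming your urllib3 version logs under the `urllib3` logger name (older releases bundled inside Requests used a different name):

```python
import logging

# Install a default handler on the root logger, but keep it at WARNING
# so unrelated libraries stay quiet.
logging.basicConfig(level=logging.WARNING)

# Enable DEBUG only for urllib3 and its child loggers
# (for example urllib3.connectionpool).
logging.getLogger("urllib3").setLevel(logging.DEBUG)
```

Records emitted by urllib3 still propagate to the root logger's handler, because the level check happens on the logger that creates the record, not on its ancestors.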

    Short demo:

    >>> import requests
    >>> import logging
    >>> logging.basicConfig(level=logging.DEBUG)
    >>> r = requests.get('http://httpbin.org/get?foo=bar&baz=python')
    DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): httpbin.org:80
    DEBUG:urllib3.connectionpool:http://httpbin.org:80 "GET /get?foo=bar&baz=python HTTP/1.1" 200 366
    

    Depending on the exact version of urllib3, the following messages are logged:

    • INFO: Redirects
    • WARN: Connection pool full (if this happens often increase the connection pool size)
    • WARN: Failed to parse headers (response headers with invalid format)
    • WARN: Retrying the connection
    • WARN: Certificate did not match expected hostname
    • WARN: Received response with both Content-Length and Transfer-Encoding, when processing a chunked response
    • DEBUG: New connections (HTTP or HTTPS)
    • DEBUG: Dropped connections
    • DEBUG: Connection details: method, path, HTTP version, status code and response length
    • DEBUG: Retry count increments

    This doesn't include headers or bodies. urllib3 uses the http.client.HTTPConnection class to do the grunt work, but that class doesn't support logging; normally it can only be configured to print to stdout. However, you can rig it to send all debug information to logging instead by introducing an alternative print name into that module:

    import logging
    import http.client
    
    httpclient_logger = logging.getLogger("http.client")
    
    def httpclient_logging_patch(level=logging.DEBUG):
        """Enable HTTPConnection debug logging to the logging framework"""
    
        def httpclient_log(*args):
            httpclient_logger.log(level, " ".join(args))
    
        # mask the print() built-in in the http.client module to use
        # logging instead
        http.client.print = httpclient_log
        # enable debugging
        http.client.HTTPConnection.debuglevel = 1
    

    Calling httpclient_logging_patch() causes http.client connections to send all debug information to a standard logger, so it is picked up by the handler that logging.basicConfig() installs:

    >>> httpclient_logging_patch()
    >>> r = requests.get('http://httpbin.org/get?foo=bar&baz=python')
    DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): httpbin.org:80
    DEBUG:http.client:send: b'GET /get?foo=bar&baz=python HTTP/1.1\r\nHost: httpbin.org\r\nUser-Agent: python-requests/2.22.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n'
    DEBUG:http.client:reply: 'HTTP/1.1 200 OK\r\n'
    DEBUG:http.client:header: Date: Tue, 04 Feb 2020 13:36:53 GMT
    DEBUG:http.client:header: Content-Type: application/json
    DEBUG:http.client:header: Content-Length: 366
    DEBUG:http.client:header: Connection: keep-alive
    DEBUG:http.client:header: Server: gunicorn/19.9.0
    DEBUG:http.client:header: Access-Control-Allow-Origin: *
    DEBUG:http.client:header: Access-Control-Allow-Credentials: true
    DEBUG:urllib3.connectionpool:http://httpbin.org:80 "GET /get?foo=bar&baz=python HTTP/1.1" 200 366
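
The patch stays active for the whole process once applied. If you only want the extra output around a specific block of code, the same idea could be wrapped in a context manager that restores the previous state afterwards. This is a sketch, not part of the original answer: the name `httpclient_logging` and the `_MISSING` sentinel are hypothetical, and the `str()` call is an added guard in case a non-string argument is ever passed to print():

```python
import contextlib
import http.client
import logging

httpclient_logger = logging.getLogger("http.client")
_MISSING = object()  # sentinel: http.client had no print override before


@contextlib.contextmanager
def httpclient_logging(level=logging.DEBUG):
    """Temporarily route http.client debug output to the logging framework."""

    def httpclient_log(*args):
        # str() guards against any non-string argument passed to print()
        httpclient_logger.log(level, " ".join(str(arg) for arg in args))

    # remember the current state so it can be restored afterwards
    old_print = vars(http.client).get("print", _MISSING)
    old_debuglevel = http.client.HTTPConnection.debuglevel

    # mask print() inside http.client and enable its debug output
    http.client.print = httpclient_log
    http.client.HTTPConnection.debuglevel = 1
    try:
        yield httpclient_logger
    finally:
        http.client.HTTPConnection.debuglevel = old_debuglevel
        if old_print is _MISSING:
            del http.client.print  # fall back to the built-in print()
        else:
            http.client.print = old_print
```

Wrapping a call such as `requests.get(...)` in `with httpclient_logging():` would then log headers for that request only, provided a handler is configured (for example via logging.basicConfig(level=logging.DEBUG)).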