Tags: python, httpconnection, httplib

HTTPConnection.request not respecting timeout?


I'm trying to use HTTPConnection (Python 2.7.8) to make a request, and I've set the timeout to 10 seconds with HTTPConnection(host, timeout=10). However, HTTPConnection.request() doesn't seem to time out after 10 seconds. In fact, HTTPConnection.timeout doesn't even seem to be read by HTTPConnection.request(); it's only read by HTTPConnection.connect(). Is my understanding correct? Is the timeout only applicable to connect() and not to request()? Is there a way to make request() time out?

Update:

I think I've narrowed the issue down further: if I don't provide the scheme, the socket timeout isn't respected. If the scheme is provided, i.e. the full URL is http://google.com:22222, then it times out as expected. I wonder why the presence of the scheme should make a difference. That is, the following doesn't respect the timeout:

    import socket
    from httplib import HTTPConnection
    
    socket.setdefaulttimeout(3)
    conn = HTTPConnection('google.com:22222')
    conn.timeout = 3
    conn.request('GET', '')

whereas, this does:

    socket.setdefaulttimeout(3)
    conn = HTTPConnection('http://google.com:22222')
    conn.timeout = 3
    conn.request('GET', '')

However, this doesn't happen with all domains.

Thanks


Solution

  • It takes around 30 seconds for the following code to fail:

    #!/usr/bin/env python2
    from httplib import HTTPConnection
    
    conn = HTTPConnection('google.com', 22222, timeout=2)
    conn.request('GET', '')
    

    If an IP address is passed to HTTPConnection instead of the hostname, the timeout error is raised after 2 seconds, as expected:

    #!/usr/bin/env python2
    import socket
    from httplib import HTTPConnection
    
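    host, port = 'google.com', 22222
    # Resolve the name once and keep only the first address. The slice keeps
    # (ip, port) even for IPv6 results, whose sockaddr is a 4-tuple.
    ip, port = socket.getaddrinfo(host, port)[0][-1][:2]
    conn = HTTPConnection(ip, port, timeout=2)
    conn.request('GET', '')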
    

    The explanation is the same as in the related question "ftplib.FTP timeout has inconsistent behaviour": the timeout limits each individual socket operation, but it says nothing about the total duration of the connection attempt, which may try several IP addresses returned by getaddrinfo() one after another. Several operations combined may take longer than the timeout.
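
If an overall limit is what you need, one option is to do the connection loop yourself and give each attempt only the time that is left. The following is a minimal sketch, not something httplib provides: connect_with_deadline is a hypothetical helper name, it runs on both Python 2 and 3, and it assumes that name resolution itself is fast, because getaddrinfo() is not covered by the deadline.

```python
import socket
import time


def connect_with_deadline(host, port, total_timeout):
    """Try every address of host:port, spending at most total_timeout seconds.

    Hypothetical helper, not part of httplib. The getaddrinfo() lookup
    itself is not bounded by the deadline.
    """
    # time.time() is used because time.monotonic() does not exist on Python 2.
    deadline = time.time() + total_timeout
    last_error = None
    addresses = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    for family, socktype, proto, _, sockaddr in addresses:
        remaining = deadline - time.time()
        if remaining <= 0:
            break  # the overall budget is used up
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            # Each attempt may use only what is left of the budget.
            sock.settimeout(remaining)
            sock.connect(sockaddr)
            return sock  # the socket keeps `remaining` as its timeout
        except socket.error as exc:
            last_error = exc
            if sock is not None:
                sock.close()
    if last_error is None:
        last_error = socket.timeout('timed out')
    raise last_error
```

One possible way to use it is to create the HTTPConnection as usual and assign the returned socket to conn.sock before calling request(); as far as I know, httplib only opens a connection itself when sock is None, but verify that against your Python version. You may also want to call settimeout() on the returned socket again, so that later reads get a sensible per-operation timeout.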

    Your HTTPConnection('http://google.com:22222') fails sooner because the URL is an incorrect argument: it should be either host or host:port. An absolute URL is accepted by the request() method, though even there it has a special meaning; typically, you just provide the path alone, such as '/'.
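
To see how the two argument forms are interpreted, you can inspect the host and port attributes without opening a connection. The sketch below does that and then shows one way to derive the arguments from a full URL; connection_args is a hypothetical helper name, and it ignores any query string. With the scheme included, the host most likely fails name resolution right away, which would explain the quick error.

```python
try:
    from httplib import HTTPConnection      # Python 2
    from urlparse import urlsplit
except ImportError:
    from http.client import HTTPConnection  # Python 3
    from urllib.parse import urlsplit

# The constructor only parses its arguments; no connection is opened here.
good = HTTPConnection('google.com:22222', timeout=2)
bad = HTTPConnection('http://google.com:22222', timeout=2)

# good.host is 'google.com' and good.port is 22222.
# bad.host is 'http://google.com', which is unlikely to resolve as a host name.


def connection_args(url):
    """Split an absolute URL into (host, port, path) for HTTPConnection.

    Hypothetical helper for illustration; the query string is dropped.
    """
    parts = urlsplit(url)
    return parts.hostname, parts.port, parts.path or '/'


host, port, path = connection_args('http://google.com:22222')
conn = HTTPConnection(host, port, timeout=2)
# conn.request('GET', path) would then send the request.
```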