Tags: python, python-requests, cron, cron-task

Python requests/urllib3 NewConnectionError only when script runs on a cronjob from the office


Weird issue I can't figure out.

I have a script that uses Python's requests library and is run as a cron job. When I'm at home over a VPN it works fine.

If I'm at the office, the cron job fails with a connection error, more specifically a NewConnectionError [error 60: connection timeout] raised by urllib3. The weird thing is that if I run the script manually from the command line, there is no problem.

I only have a high-level understanding of how requests/urllib3/cron work. My guess is that the connection is cached in some way, but I'm not sure. Does anyone know what could be causing this?

The script itself is a sync utility that connects to Bitbucket's API. I created an API wrapper for this, which is essentially just an object for building queries. Here is a snippet from the wrapper:

# At the top of the wrapper module (imports the snippet relies on):
import requests
from requests.exceptions import (HTTPError, ConnectTimeout, ConnectionError,
                                 Timeout, RequestException)

def __init__(self, username, password):
    # One shared Session so every call reuses the same credentials and connection pool.
    s = requests.Session()
    s.auth = (username, password)
    self._bitbucket_session = s

def _get_context(self, url, paging):
    try:
        r = self._bitbucket_session.get(url)
        # Bitbucket can report failures via the status code or in the JSON body.
        if r.status_code == 403:
            raise self.BitbucketAPIError('BitbucketAPIError: {}'.format(r.reason))
        if 'error' in r.json():
            raise self.BitbucketAPIError('BitbucketAPIError: {}'.format(r.json()['error']['message']))
    except HTTPError as e:
        print("HTTP Error: {}".format(e))
    except ConnectTimeout as e:
        print("The request timed out while trying to connect to the remote server: {}".format(e))
    except ConnectionError as e:
        print("Connection Error: {}".format(e))
    except Timeout as e:
        print("Connection Timed out: {}".format(e))
    except RequestException as e:
        print("Unhandled exception: {}".format(e))

And here is a simplified version of the sync client that is being "croned":

bapi = BitbucketApi(username, password)
# blah blah blah 
update_members()
update_repository()

bapi.close()

Here is the close method:

def close(self):
    self._bitbucket_session.close()
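
One way to narrow this down, since cron starts jobs with a much sparser environment than an interactive shell, would be to dump the environment the script actually sees at startup and compare a cron run against a manual run. This is only a diagnostic sketch, not part of the sync utility (the dump path is arbitrary):

import os
import sys

# Diagnostic only: append the environment the interpreter sees to a file so
# a cron run can be compared against a manual run from the shell.
with open('/tmp/sync_env_dump.txt', 'a') as f:
    f.write('argv: {}\n'.format(sys.argv))
    for key in sorted(os.environ):
        f.write('{}={}\n'.format(key, os.environ[key]))
    f.write('----\n')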

Solution

  • There is probably a proxy involved.

    When the script is run from your home there is no proxy, or the proxy is properly configured, so there is no problem.

    When the script is run from the command line at your office, the shell environment is properly configured to set an HTTP/HTTPS proxy via environment variables:

    export http_proxy="http://proxy.com:3128"
    export https_proxy="https://proxy.com:3128"
    

    (The uppercase variants, HTTP_PROXY and HTTPS_PROXY, are also effective.)

    However, when the script is run from cron, the environment does not have the proxy variables set, and the connection attempt times out. The fix is to create a wrapper script that sets the proxy variables and have cron execute that wrapper instead of the Python script directly, e.g. (an in-script alternative is sketched below the wrapper):

    #!/bin/sh
    export HTTP_PROXY="http://proxy:1234"
    export HTTPS_PROXY="https://proxy:1234"
    python your_script.py
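
    Make the wrapper executable and point the cron entry at it instead of the Python script.

    If you would rather not depend on cron's environment at all, you can also set the proxy directly on the requests Session that the wrapper creates. A minimal sketch, using the same placeholder proxy address as above:

    import requests

    s = requests.Session()
    # Placeholder proxy host/port -- substitute your office proxy here.
    # With proxies set explicitly on the Session, the script no longer relies
    # on http_proxy/https_proxy being present in cron's environment.
    s.proxies = {
        'http': 'http://proxy.com:3128',
        'https': 'https://proxy.com:3128',
    }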