
Is there a deadlock in my simple loop code?

I have a microservice with a job that should run only if a different server is up. For a few weeks it worked great: if the server was down, the microservice slept for a bit without doing the job (as it should), and if the server was up, the job was done. The server is never down for more than a few minutes (for sure, it is heavily monitored), so the job is skipped 2-3 times at most.

Today I entered my Docker container and noticed in the logs that the job hasn't even tried to run for a few weeks now (bad choice not to monitor, I know), which I assume indicates some kind of deadlock. I also assume the problem is in my exception handling. I could use some advice, as I work alone.

def is_server_healthy():
    url = "url" #correct url for health check path
    try:
        res = requests.get(url)
    except Exception as ex:
        LOGGER.error(f"Can't health check!{ex}")

    return res

def init():
    while True:
        LOGGER.info(f"Sleeping for {SLEEP_TIME} Minutes")

        res = is_server_healthy()

        if res.status_code == 200:
            my_api.DoJob()
            LOGGER.info(f"Server is: {res.text}")
        else:
            LOGGER.info(f"Server is down... {res.status_code}")

(The names of the variables were changed to simplify the question)

The health check is simple enough: it returns "up" if the server is up. Anything else is considered down, so unless status 200 and "up" come back, I consider the server to be down.


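  • If your server is down (more precisely, if the request raises), you get an uncaught error:

    NameError: name 'res' is not defined

    (In your function, where res is assigned inside the try block, Python reports this as UnboundLocalError, which is a subclass of NameError.)

    Why? See: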

    def is_server_healthy():
        url = "don't care"
        try:
            raise Exception()  # simulate fail
        except Exception as ex:
            print(f"Can't health check!{ex}")
        return res   ## name is not known ;o)

    res = is_server_healthy()
    if res.status_code == 200:   # here, next exception bound to happen
        my_api.DoJob()
        LOGGER.info(f"Server is: {res.text}")
    else:
        LOGGER.info(f"Server is down... {res.status_code}")

    Even if you declared the name beforehand, the calling code would try to access an attribute that's not there:

    if res.status_code == 200:   # here - object has no attribute 'status_code'
        my_api.DoJob()
        LOGGER.info(f"Server is: {res.text}")
    else:
        LOGGER.info(f"Server is down... {res.status_code}")

    That access raises an AttributeError that nothing catches, so the exception escapes the loop and the process is gone. It is a crash, not a deadlock.

    You are probably better off using a system-specific scheduler (cron jobs, Task Scheduler) to call your script once every minute than idling in a while True: loop with sleep.
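
If you do move to a scheduler, a minimal sketch of a run-once script might look like the following. It assumes the health rule from the question (status 200 and body "up"). The names run_once, get and do_job are hypothetical, and get stands in for requests.get so the sketch runs without a network. The health check returns a bool on every path, so the caller never touches an unbound res or a missing status_code.

```python
import logging

LOGGER = logging.getLogger("job")


def is_server_healthy(get, url, timeout=5.0):
    """Return True only for status 200 with body "up"; never raise.

    `get` is any callable shaped like requests.get (a hypothetical
    parameter, injected so this sketch needs no network access).
    """
    try:
        # A timeout keeps a hung connection from blocking the job indefinitely.
        res = get(url, timeout=timeout)
    except Exception as ex:
        LOGGER.error("Can't health check! %s", ex)
        return False  # failure is reported as a value, not as a missing name
    return res.status_code == 200 and res.text.strip() == "up"


def run_once(get, url, do_job):
    """Do the job once if the server is up; return a process exit code."""
    if not is_server_healthy(get, url):
        LOGGER.warning("Server is down, skipping this run")
        return 0  # a skipped run is expected, not a failure
    try:
        do_job()
    except Exception:
        # Log the full traceback so a failing job shows up in the logs.
        LOGGER.exception("Job failed")
        return 1
    return 0
```

In the service this could be called as sys.exit(run_once(requests.get, url, my_api.DoJob)) from the script the scheduler starts. If the job itself fails, the non-zero exit code and the logged traceback make that visible on the next look at the logs.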