Tags: deployment, haproxy, mesos, marathon, graceful-degradation

How to gracefully do a rolling deployment in mesos-marathon


Currently, if you deploy a new version of a Docker image using the Mesos/Marathon framework, the containers running the old image receive a SIGTERM, and 3 seconds later they receive a SIGKILL and are terminated immediately.

If we integrate the framework with marathon-lb (an HAProxy wrapper), these containers stay in rotation (HAProxy keeps sending traffic to them) until the next health check is triggered, which happens at an interval configurable in HAProxy. So all requests sent to these containers during that interval end up getting 5XX responses. Is there a workaround to take the containers out of rotation in marathon-lb before the SIGKILL?

Even if you set the health check interval to 3 seconds, a graceful deployment cannot be guaranteed, because there can be a race between the next health check and the 3 seconds after which Marathon sends SIGKILL to the container. Setting the health check interval to 1 second is not feasible as the number of backend nodes grows. Is there any other way of achieving this?


Solution

  • There are 3 options that could work for you.

    1. As @Tobi proposed, take a look at blue-green deployment with Marathon.
    2. You can increase executor_shutdown_grace_period, or decrease the HAProxy health check interval (which, as you mentioned, is not an option for you) so that it is at most half the grace period.
    3. A little hacky, but your application could deregister itself from HAProxy when it receives SIGTERM.
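
One way to approximate option 3 without talking to HAProxy directly is to have the application start failing its own health check as soon as it receives SIGTERM, and keep serving real traffic until HAProxy has had time to notice. The sketch below is a minimal illustration of that idea, not marathon-lb's own mechanism. The `/health` path, the port, and the `check_interval` and `fall` values are assumptions for illustration; they must match whatever your HAProxy health check actually uses. It only helps if the grace period before SIGKILL is longer than the drain time.

```python
import signal
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set once SIGTERM arrives; the health endpoint reads it.
draining = threading.Event()


def health_status() -> int:
    """HTTP status the health endpoint should return right now."""
    return 503 if draining.is_set() else 200


def drain_seconds(check_interval: float, fall: int) -> float:
    """Rough time for the balancer to observe `fall` failed checks in a row."""
    return check_interval * fall


def on_sigterm(signum, frame):
    # Do not exit here: only start failing health checks so the load
    # balancer stops routing new requests to this instance.
    draining.set()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "/health" is a hypothetical path; every other path stands in
        # for the real application and keeps answering while draining.
        status = health_status() if self.path == "/health" else 200
        self.send_response(status)
        self.end_headers()
        self.wfile.write(b"draining\n" if status == 503 else b"ok\n")


def serve(port: int = 8080, check_interval: float = 1.0, fall: int = 2) -> None:
    signal.signal(signal.SIGTERM, on_sigterm)
    server = HTTPServer(("", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Wait for SIGTERM; the timeout keeps the main thread responsive.
    while not draining.wait(timeout=0.5):
        pass
    # Keep serving while the health checks fail, then stop.
    time.sleep(drain_seconds(check_interval, fall))
    server.shutdown()
```

Calling `serve()` from the container's entry point starts the server and blocks until the drain has finished. With the assumed values the instance drains for about 2 seconds, which fits inside the 3-second window from the question only with little margin, so in practice this would likely be combined with a longer grace period as in option 2.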