Currently, if you deploy a new version of a Docker image using the Mesos/Marathon framework, the containers running the old image get a SIGTERM, and 3 seconds later they get a SIGKILL and are terminated immediately.
If we integrate the framework with marathon-lb (an HAProxy wrapper), these containers stay in rotation (HAProxy continues to send traffic to them) until the next health check is triggered, which happens at an interval configurable in HAProxy. So all requests sent to these containers during that interval end up getting 5XX responses. Is there a workaround to take the containers out of rotation in marathon-lb before the SIGKILL?
Even if you set the health check interval to 3 seconds, a graceful deployment cannot be guaranteed, because there can be a race condition between the next health check and the 3 seconds after which Marathon sends SIGKILL to the container. Setting the health check interval to 1 second is not feasible once the number of backend nodes grows. Is there any other way of achieving this?
There are 3 options that could work for you.
Increase executor_shutdown_grace_period (a Mesos agent flag), so the container gets more time between SIGTERM and SIGKILL,
or decrease the HAProxy health check interval (as you mentioned, this is not an option for you) to at most half of the grace period.
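To make the relationship between the grace period and the health check interval concrete, here is a rough sketch of the timing arithmetic. It assumes the container keeps serving but starts failing its health check as soon as it receives SIGTERM, and it models HAProxy as marking a server down after `fall` consecutive failed checks spaced `inter` seconds apart. The function names and the `check_timeout` parameter are hypothetical, chosen only for illustration.

```python
def worst_case_detection_seconds(inter: float, fall: int = 1,
                                 check_timeout: float = 0.0) -> float:
    """Upper bound on how long the load balancer may keep routing traffic
    to a backend after it starts failing health checks.

    Assumption: the failure begins just after a successful check, so the
    first failed check happens `inter` seconds later, and the backend is
    marked down on the `fall`-th consecutive failure.
    """
    return inter * fall + check_timeout


def drains_before_kill(grace_period: float, inter: float, fall: int = 1,
                       check_timeout: float = 0.0) -> bool:
    """True if the backend is guaranteed to be out of rotation strictly
    before SIGKILL arrives (strict comparison, to avoid the race)."""
    return worst_case_detection_seconds(inter, fall, check_timeout) < grace_period


if __name__ == "__main__":
    # Interval equal to the grace period: the race described in the question.
    print(drains_before_kill(grace_period=3.0, inter=3.0))
    # Interval at half the grace period.
    print(drains_before_kill(grace_period=3.0, inter=1.5))
```

With an interval equal to the grace period the check can land at the same moment as the SIGKILL, which is why the interval needs to be comfortably shorter, or the grace period longer.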