With an ELB setup, there is a health check timeout, e.g. a server is taken out of the LB if it fails X consecutive health checks.
For a real zero-downtime deployment, I want to avoid these extra 4-5 seconds of downtime.
Is there a simple way to do that on the ops side, or does this need to be handled at the level of the web server itself?
If you're doing continuous deployment, you should deregister the instance you're deploying to from the ELB (say, with aws elb deregister-instances-from-load-balancer), wait for the current connections to drain, deploy your app, and then register the instance with the ELB again (aws elb register-instances-with-load-balancer).
http://docs.aws.amazon.com/cli/latest/reference/elb/deregister-instances-from-load-balancer.html
http://docs.aws.amazon.com/cli/latest/reference/elb/register-instances-with-load-balancer.html
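The deregister, drain, deploy, register sequence could be scripted roughly as follows. This is only a sketch for a Classic ELB: the function names, the fixed drain delay, and the deploy_app placeholder are assumptions for illustration, and a real setup would more likely rely on the ELB's connection draining setting than on a plain sleep.

```shell
#!/usr/bin/env bash
# Sketch only: rolling_deploy, instance_in_service and deploy_app are
# hypothetical names, not part of the AWS CLI.

# Placeholder for your own deployment step (assumption: replace this).
deploy_app() {
  echo "deploying to $1 (replace with your real deploy step)" >&2
}

# Succeeds when the ELB reports the instance as InService.
instance_in_service() {
  local lb="$1" instance="$2" state
  state=$(aws elb describe-instance-health \
    --load-balancer-name "$lb" \
    --instances "$instance" \
    --query 'InstanceStates[0].State' --output text)
  [ "$state" = "InService" ]
}

# Usage: rolling_deploy <load-balancer-name> <instance-id> [drain-seconds]
rolling_deploy() {
  local lb="$1" instance="$2" drain_seconds="${3:-30}"

  # 1. Stop sending new requests to the instance.
  aws elb deregister-instances-from-load-balancer \
    --load-balancer-name "$lb" --instances "$instance" || return 1

  # 2. Crude drain: give in-flight requests time to finish.
  sleep "$drain_seconds"

  # 3. Deploy while the instance receives no traffic.
  deploy_app "$instance" || return 1

  # 4. Put the instance back behind the load balancer.
  aws elb register-instances-with-load-balancer \
    --load-balancer-name "$lb" --instances "$instance" || return 1

  # 5. Wait until health checks pass before moving to the next instance.
  until instance_in_service "$lb" "$instance"; do
    sleep 5
  done
}
```

You would call it once per instance, e.g. rolling_deploy my-lb i-0123456789abcdef0 30, so that at least one other instance keeps serving traffic at any time.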
Another common strategy is to deploy to a separate Auto Scaling group and then switch the load balancer over to the new ASG.
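The ASG switch could look something like the sketch below, assuming a Classic ELB and two existing Auto Scaling groups. The function name switch_asg and the polling interval are made up for illustration; the new group is attached first and the old one detached only after the new one reports InService, so the load balancer is never left without instances.

```shell
#!/usr/bin/env bash
# Sketch only: switch_asg is a hypothetical helper, not an AWS CLI command.
# Usage: switch_asg <load-balancer-name> <old-asg-name> <new-asg-name>
switch_asg() {
  local lb="$1" old_asg="$2" new_asg="$3" state

  # Attach the new group so its instances get registered with the ELB.
  aws autoscaling attach-load-balancers \
    --auto-scaling-group-name "$new_asg" \
    --load-balancer-names "$lb" || return 1

  # Poll until the attachment reports InService, i.e. at least one
  # instance in the new group passes the ELB health check.
  while :; do
    state=$(aws autoscaling describe-load-balancers \
      --auto-scaling-group-name "$new_asg" \
      --query 'LoadBalancers[0].State' --output text)
    [ "$state" = "InService" ] && break
    sleep 5
  done

  # Only now take the old group out of rotation.
  aws autoscaling detach-load-balancers \
    --auto-scaling-group-name "$old_asg" \
    --load-balancer-names "$lb"
}
```

Waiting for a single healthy instance is a low bar; in practice you may want to check that the whole new group is healthy before detaching the old one.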