azure microservices load-balancing haproxy high-availability

Setting Up a High-Availability Infrastructure

[Problem Statement] We have a Tier 0 service with an HAProxy load balancer and multiple back-end servers configured behind it. Currently, the infrastructure serves requests with a P99 latency of ~100 ms. Our target is 100% availability and zero downtime. Sometimes a back-end server misbehaves or drops out of the load balancer, and at that moment every request that landed on that server times out. We are looking for a configuration in which any request that takes more than 100 ms on one server is routed to another back-end server, so that close to 100% of requests complete without timing out.

[Disclaimer] I understand that if a request still times out after a certain number of retries, the timeout will be returned to the end consumer of our Tier 0 service.

[Tech Stack] HAProxy, Java, MySQL, Azure

I would appreciate a discussion of this problem. I searched a lot but could not find any reference for the approach I have in mind. There may well be other ways to achieve no downtime while staying within the service's defined SLA.

Thanks

Solution

The option redispatch directive allows a retried request to be sent to a different server instead of the one that just failed. The retry-on directive specifies which types of errors trigger a retry. The retries directive sets how many times to retry.

option redispatch 1
retry-on all-retryable-errors
retries 3

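In addition, you'll want to test and tune the following timeouts. A retry on a slow response is attempted only after timeout server expires, so that value has to fit within your latency budget: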

timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
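
As a rough sketch of how these directives might fit together in haproxy.cfg, assuming HTTP mode (retry-on works at the HTTP layer): the backend name, server names, and addresses below are hypothetical, and the timeout values are simply the ones above, not tuned recommendations.

defaults
    mode http
    # Starting values from above; test and adjust for your own SLA
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

backend tier0_backend
    balance roundrobin
    # Allow every retry to go to a different server
    option redispatch 1
    # Retry on connection failures, response timeouts, and retryable 5xx errors
    retry-on all-retryable-errors
    retries 3
    # Hypothetical servers; "check" enables health checks so failed servers leave rotation
    server app1 10.0.0.11:8080 check
    server app2 10.0.0.12:8080 check

With values like these, a response-timeout retry would only start after 50 seconds, so timeout server would presumably need to be much lower to get anywhere near a 100 ms target.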

Make sure all requests are idempotent and have no side effects. Otherwise, a retried request may be processed more than once by the back-end servers, and you will end up causing a lot of problems for yourself.
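
If some requests cannot be made idempotent, one possible safeguard is to turn off layer 7 retries for them. A minimal sketch, assuming that POST is the only non-idempotent method in use and reusing a hypothetical backend name; connection-level failures can still be retried with this rule in place:

backend tier0_backend
    retry-on all-retryable-errors
    retries 3
    # Assumption: only POST requests have side effects.
    # METH_POST is one of HAProxy's predefined ACLs.
    http-request disable-l7-retry if METH_POST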