azure microservices load-balancing haproxy high-availability

Setting Up High-Availability Infrastructure


[Problem Statement] We have a Tier 0 service with an HAProxy load balancer (LB) and multiple back-end servers configured behind it. Currently, the infrastructure serves requests with a P99 latency of ~100 ms. The requirement is 100% availability and zero downtime. Sometimes one of the back-end servers misbehaves or drops out of the LB, and at that moment all requests that landed on that server time out. So we are looking for a configuration in which any request that takes more than 100 ms on one server is routed to another back-end server, so that we get as close as possible to 100% of requests served without timeouts.

[Disclaimer] I understand that if a request still times out after a certain number of retries, the timeout will be returned to the end consumer of our Tier 0 service.

[Tech Stack] HAProxy, Java, MySQL, Azure

I would appreciate a discussion of this problem. I searched a lot but could not find any reference for the approach I have in mind. There may well be other ways to achieve no downtime while staying within the defined SLA of the service.

Thanks


Solution

  • The option redispatch directive allows a retried request to be sent to a different server instead of the one that just failed. The retry-on directive specifies which types of errors trigger a retry. The retries directive sets how many times to retry.

    option redispatch 1
    retry-on all-retryable-errors
    retries 3
    

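    Plus, you'll want to test and tune the following timeouts. A retry on a response timeout only fires once timeout server has expired, so treat the values below as a starting point and not as a match for a 100 ms target: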

    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    

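    Make sure all requests that can be retried are idempotent, i.e. repeating them has no additional side effects. Otherwise, a request that was partly processed before the timeout may be applied twice when it is retried (for example, a duplicate write to MySQL).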
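
    Putting the pieces above together, a minimal sketch might look like the following. The backend name, server names, addresses, and the 100 ms values are assumptions for illustration only and are not taken from the question's real setup. The sketch assumes HTTP mode and an HAProxy version that supports retry-on (2.0 or later). The http-request disable-l7-retry rule is one possible way to keep POST requests, which are often not idempotent, from being replayed.

    defaults
        mode http
        # Assumed per-attempt budgets; tune against real traffic.
        timeout connect 100ms
        timeout client  5s
        timeout server  100ms
        # Retry on the retryable errors (including response timeouts),
        # and allow each retry to go to a different server.
        retries 3
        option redispatch 1
        retry-on all-retryable-errors

    backend tier0_servers                      # hypothetical backend name
        balance roundrobin
        # Do not replay POST requests at layer 7 (assumed non-idempotent).
        http-request disable-l7-retry if METH_POST
        server app1 10.0.0.11:8080 check       # hypothetical addresses
        server app2 10.0.0.12:8080 check

    Two trade-offs to keep in mind with values like these: each retry can add up to another timeout server to the worst-case latency seen by the client, and a 100 ms cutoff sits right at the stated P99, so roughly 1% of requests to healthy servers would also be retried.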