istio, envoyproxy, circuit-breaker

Circuit breaker pattern in Envoy proxy


I have two services, A and B, that send requests to a third service, C, this way: A-->C and B-->C

C is configured in Istio with a DestinationRule that applies the circuit breaker pattern (outlier detection). When C responds with a series of consecutive 5xx errors, the circuit breaker opens, and from that moment on a 503 Service Unavailable is received.

In the calls to C, I use a query parameter to indicate whether I want to simulate a 500 error.

I thought that once the circuit is open, it is open for all calls regardless of their origin. However, when B calls C with the flag to simulate a 500 error, B starts receiving a 503 while A keeps receiving a 200 OK. If I then configure A to call C with the same flag, A starts receiving a 503 as well. It seems that opening the circuit depends on who makes the call. Is this the expected behavior?


Solution

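  • Yes. The DestinationRule is enforced by the Envoy sidecar of each calling workload, so A and B each track C's failures independently.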

    Take a look at my other answer on Stack Overflow, which has related information. There I checked the logs to see exactly where the circuit breaker is tripped.

    By inspecting the log entries you can identify both ends of the connection that the circuit breaker stopped. The message comes from the istio-proxy container, which runs the Envoy instance that applied the circuit breaker policy to the request, and it contains the IP addresses of both the source and the destination of the interrupted connection.

    According to this article, an Istio circuit breaker has three states:

    A circuit breaker can have three states: closed, open and half open, and by default exists in a closed state. In the closed state, requests succeed or fail until the number of failures reach a predetermined threshold, with no interference from the breaker. When the threshold is reached, the circuit breaker opens. When calling a service in an open state, the circuit breaker trips the requests, which means that it returns an error without attempting to execute the call. In this way, by tripping the request downstream at the client, cascading failures can be prevented in a production system. After a configurable timeout, the circuit breaker enters a half open state, in which the failing service is given time to recover from its broken behavior. If requests continue to fail in this state, then the circuit breaker is opened again and keeps tripping requests. Otherwise, if the requests succeed in the half open state, then the circuit breaker will close and the service will be allowed to handle requests again.
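The three states described in the quote can be sketched as a small state machine. This is only an illustration of the pattern, not Envoy's implementation: the class name `CircuitBreaker`, the parameters `failure_threshold` and `open_seconds`, and the injectable `clock` are all hypothetical names chosen for the example. Creating one breaker per caller mimics what the question observed, where B is cut off while A still gets through.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Toy three-state breaker: closed -> open -> half-open -> closed or open."""

    def __init__(self, failure_threshold: int = 3, open_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold  # consecutive 5xx before opening
        self.open_seconds = open_seconds            # how long the breaker stays open
        self.clock = clock                          # injectable so tests can fake time
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, request: Callable[[], int]) -> int:
        """Run request() and return its status, or 503 without running it while open."""
        if self.state == "open":
            if self.clock() - self.opened_at < self.open_seconds:
                return 503  # tripped: the upstream is never contacted
            self.state = "half-open"  # timeout elapsed: let a trial request through

        status = request()
        if status >= 500:
            self.failures += 1
            # A failure in half-open reopens at once; otherwise wait for the threshold.
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
        else:
            # Any success resets the consecutive-failure count and closes the breaker.
            self.failures = 0
            self.state = "closed"
        return status


# One breaker per caller, sharing a fake clock (a one-element list we can mutate).
now = [0.0]
breaker_a = CircuitBreaker(clock=lambda: now[0])
breaker_b = CircuitBreaker(clock=lambda: now[0])

for _ in range(3):
    breaker_b.call(lambda: 500)               # B asks C to simulate an error

status_b = breaker_b.call(lambda: 200)        # B's breaker is open: 503
status_a = breaker_a.call(lambda: 200)        # A's breaker is still closed: 200

now[0] = 31.0                                 # let the open period elapse
status_b_later = breaker_b.call(lambda: 200)  # half-open trial succeeds: 200
```

Because each caller owns its own breaker instance, the failures that B provokes never affect the state that A sees.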

    With this in mind, testing a circuit breaker can be difficult: you may accidentally trip it again while it is in the half open state, and the ejection time increases each time the service fails to recover.

    I suggest reading the whole article I mentioned, as it has the most detailed explanation of circuit breakers I could find on the internet.
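To make the growing ejection time concrete, here is a rough sketch based on my understanding of Envoy's documented outlier detection behavior: a host is ejected for the base ejection time multiplied by the number of times it has been ejected, up to a maximum. The function name `ejection_seconds` and the 300 second default cap are assumptions made for this illustration, so check the values against your own Istio and Envoy versions.

```python
def ejection_seconds(base_ejection_seconds: float, times_ejected: int,
                     max_ejection_seconds: float = 300.0) -> float:
    """Approximate ejection duration: base time scaled by ejection count, capped.

    The cap and its default value are assumptions made for this sketch.
    """
    return min(base_ejection_seconds * times_ejected, max_ejection_seconds)


# With a 30 second base, each repeated ejection keeps the host out longer.
first = ejection_seconds(30, 1)
third = ejection_seconds(30, 3)
capped = ejection_seconds(30, 20)
```

With a base of 30 seconds, the third ejection in a row already keeps the host out three times as long as the first, which is easy to run into during repeated manual tests.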