I have two services, A and B, that send requests to a third service, C, this way: A-->C and B-->C
C is configured with Istio with a destination-rule that has the circuit breaker pattern (outlier detection) configured. When C responds with a consecutive series of 5xx errors the circuit breaker is opened, and from this moment a 503 Service unavailable is received.
In the calls to C I am using a query param to indicate if I want to simulate an error 500 or not.
I thought that once the circuit is open, it is open for all calls regardless of the origin. However, when B calls C with the flag to simulate error 500, B starts receiving a 503 while A keeps receiving a 200 OK. If I then configure A to call C with the flag to simulate error 500, A starts receiving a 503 as well. It seems that opening the circuit depends on who makes the call. Is this the expected behavior?
Yes.
Take a look at my other answer on Stack Overflow, which has related information. I checked in the logs where exactly the circuit breaker is tripped.
You can inspect the log entries to figure out both ends of the connection that was stopped by the circuit breaker. The IP addresses of both sides of the connection are present in the log message from the istio-proxy container.
The message comes from the istio-proxy container, which runs the Envoy instance affected by the CircuitBreaker policy of the service the request was sent to. It also contains the IP addresses of both the source and the destination of the connection that was interrupted.
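To illustrate why A and B can see different results, here is a minimal Python sketch that assumes each caller's proxy keeps its own error count for C and ejects C independently. The class `SidecarProxy`, its fields, and the threshold of 3 are hypothetical names and values chosen for illustration; this is a toy model, not Envoy's or Istio's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class SidecarProxy:
    """Hypothetical stand-in for one caller's client-side proxy."""
    name: str
    consecutive_5xx_limit: int = 3  # assumed threshold, for illustration only
    consecutive_errors: int = 0
    ejected: bool = False

    def call(self, upstream_status: int) -> int:
        # Once this proxy has ejected the upstream, it fails fast with a 503
        # and does not forward the request.
        if self.ejected:
            return 503
        if upstream_status >= 500:
            self.consecutive_errors += 1
            if self.consecutive_errors >= self.consecutive_5xx_limit:
                self.ejected = True
        else:
            # Any successful response resets the consecutive error count.
            self.consecutive_errors = 0
        return upstream_status


proxy_a = SidecarProxy("A")
proxy_b = SidecarProxy("B")

# B asks C to simulate a 500 on every call; A makes normal calls.
b_results = [proxy_b.call(500) for _ in range(5)]
a_results = [proxy_a.call(200) for _ in range(5)]

print("B sees:", b_results)
print("A sees:", a_results)
```

In this model only B's proxy has counted enough errors to stop forwarding requests, so B gets 503 responses while A's calls still reach C.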
According to this article, the Istio circuit breaker has three states:
A circuit breaker can have three states: closed, open and half open, and by default exists in a closed state. In the closed state, requests succeed or fail until the number of failures reach a predetermined threshold, with no interference from the breaker. When the threshold is reached, the circuit breaker opens. When calling a service in an open state, the circuit breaker trips the requests, which means that it returns an error without attempting to execute the call. In this way, by tripping the request downstream at the client, cascading failures can be prevented in a production system. After a configurable timeout, the circuit breaker enters a half open state, in which the failing service is given time to recover from its broken behavior. If requests continue to fail in this state, then the circuit breaker is opened again and keeps tripping requests. Otherwise, if the requests succeed in the half open state, then the circuit breaker will close and the service will be allowed to handle requests again.
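The three states described in the quote can be sketched as a small state machine. This is a generic illustration of the pattern in Python, not how Envoy implements it: the `CircuitBreaker` class, the threshold of 3, the 30-second timeout, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all assumptions.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half open"


class CircuitBreaker:
    def __init__(self, failure_threshold=3, open_timeout=30.0):
        self.failure_threshold = failure_threshold  # assumed value
        self.open_timeout = open_timeout            # assumed value, in seconds
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = None

    def call(self, func, now):
        if self.state is State.OPEN:
            if now - self.opened_at >= self.open_timeout:
                # Timeout elapsed: let a trial request through.
                self.state = State.HALF_OPEN
            else:
                # Trip the request without executing the call.
                return "rejected"
        try:
            result = func()
        except Exception:
            self._on_failure(now)
            return "failed"
        self._on_success()
        return result

    def _on_failure(self, now):
        self.failures += 1
        # A failure in the half open state reopens the circuit immediately.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = now

    def _on_success(self):
        self.failures = 0
        self.state = State.CLOSED


def failing():
    raise RuntimeError("simulated 500")


def healthy():
    return "ok"


breaker = CircuitBreaker(failure_threshold=3, open_timeout=30.0)
trace = [breaker.call(failing, now=t) for t in (0, 1, 2)]  # third failure opens the circuit
trace.append(breaker.call(healthy, now=3))   # still open: rejected without calling
trace.append(breaker.call(healthy, now=40))  # timeout elapsed: trial call succeeds
print(trace, breaker.state)
```

The trace shows the request at `now=3` being rejected even though the service would have answered successfully, which is the behavior the quote describes for the open state.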
Taking this into consideration, testing a circuit breaker can be difficult: you may accidentally trip it again while it is in the half open state, and the result also depends on the ejection time, which increases each time the service fails to recover.
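As a rough illustration of the growing ejection time, the sketch below assumes the duration is the base ejection time multiplied by the number of ejections so far, capped at a maximum, which is how Envoy's outlier detection documentation describes it. The function name `ejection_seconds` and the 30-second base and 300-second cap are assumptions for the example; check them against your own DestinationRule and Envoy version.

```python
def ejection_seconds(base_ejection_time, times_ejected, max_ejection_time=300.0):
    """Ejection duration under the assumed 'base * ejection count, capped' rule."""
    return min(base_ejection_time * times_ejected, max_ejection_time)


# Duration of the 1st to 4th ejection with an assumed 30 s base ejection time.
durations = [ejection_seconds(30.0, n) for n in range(1, 5)]
print(durations)
```

Under these assumptions, a test that trips the breaker several times in a row has to wait longer after each attempt before C receives traffic again.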
I suggest reading the whole article I mentioned, as it has the most detailed explanation of circuit breakers I could find on the internet.