I want to use Polly retry and circuit breaker policies with the Ocelot API gateway. I am trying to wrap the policies in a DelegatingHandler; the circuit breaker works, but the retry does not.
The code below just throws the exception, and no retry happens. When I call the API 3 times, the circuit opens.
"ExceptionsAllowedBeforeBreaking": 3,
.CircuitBreakerAsync(route.QosOptions.ExceptionsAllowedBeforeBreaking,
[HttpGet("RaiseException")]
public async Task<int> RaiseException()
{
    await Task.Delay(1);
    throw new Exception("Mock Exception");
}
Custom Handler:
public class PollyWithInternalServerErrorCircuitBreakingDelegatingHandler : DelegatingHandler
{
    private readonly IOcelotLogger _logger;
    private readonly Polly.Wrap.AsyncPolicyWrap<HttpResponseMessage> _circuitBreakerPolicies;

    public PollyWithInternalServerErrorCircuitBreakingDelegatingHandler(DownstreamRoute route, IOcelotLoggerFactory loggerFactory)
    {
        _logger = loggerFactory.CreateLogger<PollyWithInternalServerErrorCircuitBreakingDelegatingHandler>();
        var pollyQosProvider = new PollyQoSProvider(route, loggerFactory);

        var retryPolicy = HttpPolicyExtensions.HandleTransientHttpError()
            .OrResult(r => r.StatusCode == HttpStatusCode.NotFound)
            .WaitAndRetryAsync(2, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));

        var responsePolicy = Policy.HandleResult<HttpResponseMessage>(r => r.StatusCode == HttpStatusCode.InternalServerError)
            .CircuitBreakerAsync(route.QosOptions.ExceptionsAllowedBeforeBreaking,
                TimeSpan.FromMilliseconds(route.QosOptions.DurationOfBreak));

        _circuitBreakerPolicies = Policy.WrapAsync(pollyQosProvider.CircuitBreaker.Policies)
            .WrapAsync(retryPolicy).WrapAsync(responsePolicy);
    }

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        try
        {
            return await _circuitBreakerPolicies.ExecuteAsync(() => base.SendAsync(request, cancellationToken));
        }
        catch (BrokenCircuitException ex)
        {
            _logger.LogError($"Reached to allowed number of exceptions. Circuit is open", ex);
            throw;
        }
        catch (HttpRequestException ex)
        {
            _logger.LogError($"Error in CircuitBreakingDelegatingHandler.SendAsync", ex);
            throw;
        }
    }
}
Ocelot Builder Extensions:
public static class OcelotBuilderExtensions
{
    public static IOcelotBuilder AddPollyWithInternalServerErrorHandling(this IOcelotBuilder builder)
    {
        var errorMapping = new Dictionary<Type, Func<Exception, Error>>
        {
            {typeof(TaskCanceledException), e => new RequestTimedOutError(e)},
            {typeof(TimeoutRejectedException), e => new RequestTimedOutError(e)},
            {typeof(BrokenCircuitException), e => new RequestTimedOutError(e)}
        };
        builder.Services.AddSingleton(errorMapping);

        DelegatingHandler QosDelegatingHandlerDelegate(DownstreamRoute route, IOcelotLoggerFactory logger)
        {
            return new PollyWithInternalServerErrorCircuitBreakingDelegatingHandler(route, logger);
        }

        builder.Services.AddSingleton((QosDelegatingHandlerDelegate)QosDelegatingHandlerDelegate);
        return builder;
    }
}
Program.cs
var builder = WebApplication.CreateBuilder(args);
// Add Ocelot's configuration file
builder.Configuration.AddJsonFile($"ocelot.config.{builder.Environment.EnvironmentName}.json", optional: false, reloadOnChange: true);
builder.Services.AddOcelot(builder.Configuration)
.AddPollyWithInternalServerErrorHandling();
Ocelot Configuration
"UpstreamHttpMethod": [ "GET" ],
"QoSOptions": {
    // Number of exceptions allowed before the circuit breaker is triggered.
    "ExceptionsAllowedBeforeBreaking": 3,
    // Duration in milliseconds for which the circuit breaker remains open after being tripped.
    "DurationOfBreak": 5000,
    // Duration in milliseconds after which the request is considered timed out.
    "TimeoutValue": 100000
}
Even though the problem has been solved by removing Ocelot, let me share my thoughts about your policies.
The way you chain the policies to each other defines an escalation order.
So, first let's see what your policy chain looks like.
The PollyQoSProvider helper class defines two policies in the following order:
- Circuit Breaker which triggers for HttpRequestException, TimeoutRejectedException and TimeoutException
- Timeout
These policies do not return any value.
You have defined two other policies:
- Retry which triggers for HttpRequestException or when the status code is either 404 or 408 or 5XX
- Circuit Breaker which triggers when the status code is 500
These policies do return an HttpResponseMessage.
You have chained them in the following order (from the outermost to the innermost):
1. Circuit Breaker which triggers for HttpRequestException, TimeoutRejectedException and TimeoutException
2. Timeout
3. Retry which triggers for HttpRequestException or when the status code is either 404 or 408 or 5XX
4. Circuit Breaker which triggers when the status code is 500
This may or may not be your desired resiliency strategy. I would advise you to reassess whether this is what you really want or need.
The policies of the PollyQoSProvider
are defined for async methods (Task
) whereas yours are defined for async functions (Task<HttpResponseMessage>
). The static WrapAsync
does not allow to combine these two types of policies. On the hand the instance level WrapAsync
does. (For more information about this constraints please read this SO topic.)
Because you have used the combination of the two that's why the chained policy is an IAsyncPolicy<HttpResponseMessage>
. Even though it's working, I usually suggest to use only the static WrapAsync
to chain policies due to its compile-time compatibility guarantees.
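As a rough sketch of the static-only approach, assuming Polly v7 and the Polly.Extensions.Http package: if every policy is declared for HttpResponseMessage (including the timeout, via the generic TimeoutAsync<TResult> overload), the static Policy.WrapAsync accepts them all, and a policy of a different type is rejected at compile time. The class name, the chosen order, and the numeric values are illustrative assumptions, not Ocelot's defaults.

```csharp
using System;
using System.Net;
using System.Net.Http;
using Polly;
using Polly.Extensions.Http;
using Polly.Timeout;

// Hypothetical helper for illustration; not part of Ocelot or Polly.
public static class TypedPolicyChain
{
    public static IAsyncPolicy<HttpResponseMessage> Build()
    {
        // Retry on HttpRequestException, 5XX and 408 (HandleTransientHttpError) plus 404.
        IAsyncPolicy<HttpResponseMessage> retry = HttpPolicyExtensions
            .HandleTransientHttpError()
            .OrResult(r => r.StatusCode == HttpStatusCode.NotFound)
            .WaitAndRetryAsync(2, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

        // Circuit breaker declared for the same result type (assumed thresholds).
        IAsyncPolicy<HttpResponseMessage> breaker = Policy<HttpResponseMessage>
            .HandleResult(r => r.StatusCode == HttpStatusCode.InternalServerError)
            .Or<HttpRequestException>()
            .Or<TimeoutRejectedException>()
            .CircuitBreakerAsync(3, TimeSpan.FromSeconds(5));

        // Generic timeout overload, so it shares TResult with the other policies.
        IAsyncPolicy<HttpResponseMessage> timeout =
            Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(10));

        // Arguments are listed from the outermost to the innermost policy.
        return Policy.WrapAsync(retry, breaker, timeout);
    }
}
```

The order used here (retry, then breaker, then a per-attempt timeout) is only one possible escalation chain; the point is that all three arguments share one type, so the static overload can check them.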
I've been using Polly for a while and I haven't encountered any use case where multiple (nested) circuit breakers were really required. Most of the time you can (and should) solve it with a single CB which can trigger for multiple different conditions:
Circuit Breaker which triggers when the status code is 500 or for the following exceptions: HttpRequestException, TimeoutRejectedException and TimeoutException
Policy<HttpResponseMessage>
    .HandleResult(r => r.StatusCode == HttpStatusCode.InternalServerError)
    .Or<HttpRequestException>()
    .Or<TimeoutRejectedException>()
    .Or<TimeoutException>()
    .CircuitBreakerAsync(
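A self-contained sketch of such a single breaker, assuming Polly v7. The thresholds (3 handled events, 5000 ms) simply mirror the QoSOptions values from the question, and the class name and the FailingCall stand-in for the downstream request are hypothetical.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;
using Polly.Timeout;

public static class SingleBreakerDemo // hypothetical name
{
    public static async Task Main()
    {
        // One breaker that reacts to the 500 result and to the exceptions.
        // 3 and 5000 ms are assumed values taken from the question's QoSOptions.
        AsyncCircuitBreakerPolicy<HttpResponseMessage> breaker = Policy<HttpResponseMessage>
            .HandleResult(r => r.StatusCode == HttpStatusCode.InternalServerError)
            .Or<HttpRequestException>()
            .Or<TimeoutRejectedException>()
            .Or<TimeoutException>()
            .CircuitBreakerAsync(3, TimeSpan.FromMilliseconds(5000));

        // Stand-in for the downstream call: always answers with 500.
        Task<HttpResponseMessage> FailingCall() =>
            Task.FromResult(new HttpResponseMessage(HttpStatusCode.InternalServerError));

        // Three consecutive handled results are enough to open the circuit.
        for (var i = 0; i < 3; i++)
        {
            await breaker.ExecuteAsync(() => FailingCall());
        }

        Console.WriteLine(breaker.CircuitState);

        try
        {
            // While the circuit is open the delegate is not invoked at all.
            await breaker.ExecuteAsync(() => FailingCall());
        }
        catch (BrokenCircuitException)
        {
            Console.WriteLine("Call rejected while the circuit is open");
        }
    }
}
```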
The CB was designed so that it can be shared between multiple components. If you have already detected that the downstream is temporarily inaccessible, then use this information everywhere rather than issuing new requests and coming to the same conclusion.
So, defining a CB inside a DelegatingHandler goes against this. Each DelegatingHandler will have its own CB, so they do not share state via the ICircuitController. Aim to reuse the CB policy.
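One possible way to reuse a breaker across handler instances is a small static cache that hands out the same policy object for the same downstream. This is only a sketch under assumptions: SharedBreakers and the string key are hypothetical, and which property of DownstreamRoute would make a suitable stable key is left open.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;
using System.Net.Http;
using Polly;
using Polly.CircuitBreaker;

// Hypothetical helper: one breaker per downstream key, so every handler
// created for that key observes (and contributes to) the same circuit state.
public static class SharedBreakers
{
    private static readonly ConcurrentDictionary<string, AsyncCircuitBreakerPolicy<HttpResponseMessage>> Breakers = new();

    public static AsyncCircuitBreakerPolicy<HttpResponseMessage> For(
        string downstreamKey, int allowedBeforeBreaking, TimeSpan durationOfBreak)
    {
        // GetOrAdd always returns the instance stored in the dictionary,
        // so all callers with the same key receive the same policy object.
        return Breakers.GetOrAdd(downstreamKey, _ => Policy<HttpResponseMessage>
            .HandleResult(r => r.StatusCode == HttpStatusCode.InternalServerError)
            .Or<HttpRequestException>()
            .CircuitBreakerAsync(allowedBeforeBreaking, durationOfBreak));
    }
}
```

In the handler's constructor this could replace the locally created breaker, for example by calling SharedBreakers.For with a key for the downstream service, route.QosOptions.ExceptionsAllowedBeforeBreaking, and TimeSpan.FromMilliseconds(route.QosOptions.DurationOfBreak). Two calls with the same key return the same reference, so a circuit opened through one handler is also seen as open by the other.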
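Timeout can work in optimistic or in pessimistic mode. At first glance your code looks like it uses the optimistic mode, but unfortunately it does not: the delegate passes the handler's own cancellationToken to base.SendAsync rather than a token supplied by the policy, so a timeout policy has no way to cancel the request.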
protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
    try
    {
        return await _circuitBreakerPolicies.ExecuteAsync(() => base.SendAsync(request, cancellationToken));
    }
    ...
}
The proper way is to use a different overload of ExecuteAsync, which hands the policy's own token to the delegate:
await _circuitBreakerPolicies.ExecuteAsync(ct => base.SendAsync(request, ct), cancellationToken);
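To illustrate why the token matters, here is a minimal standalone sketch, assuming Polly v7. The 200 ms timeout and the 5 second Task.Delay (standing in for a slow downstream call) are arbitrary, and the class name is hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Timeout;

public static class OptimisticTimeoutDemo // hypothetical name
{
    public static async Task Main()
    {
        // Optimistic timeout: it relies on the delegate honouring the token it receives.
        var timeout = Policy.TimeoutAsync(TimeSpan.FromMilliseconds(200), TimeoutStrategy.Optimistic);

        try
        {
            // The policy passes a timeout-aware token as `ct`;
            // Task.Delay observes it and is cancelled when the timeout elapses.
            await timeout.ExecuteAsync(
                ct => Task.Delay(TimeSpan.FromSeconds(5), ct),
                CancellationToken.None);
        }
        catch (TimeoutRejectedException)
        {
            Console.WriteLine("Timed out: the delegate was cancelled via the policy's token");
        }
    }
}
```

If the delegate ignored `ct` and used some outer token instead, as in the handler above, the optimistic timeout could not interrupt it and the delay would run to completion.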
OFF: The PollyQoSProvider's CB was defined so that it can break for an optimistic timeout (TimeoutRejectedException) as well as a pessimistic timeout (TimeoutException).