c# .net dotnet-httpclient polly retry-logic

Retry policy for Http client

How can I improve this code so that the retry applies only to the exceptions that are worth retrying by default?

public static readonly AsyncPolicy RetryPolicy = Policy
   .Handle<Exception>()
   .RetryAsync(3);

Solution

Exceptions

The SendAsync operation can throw the following exceptions:

ArgumentNullException when the request is null
InvalidOperationException when the request message has already been sent by the HttpClient instance
HttpRequestException when the request fails due to an underlying issue such as network connectivity, DNS failure, server certificate validation or timeout
TaskCanceledException when the request fails due to timeout (.NET Core and .NET 5 and later only)
OperationCanceledException when the cancellation token is canceled (this exception is stored in the returned task)

Retrying on the first two exceptions will not change the outcome: the call will fail again with exactly the same exception. These exceptions indicate a problem on the caller's side.

TaskCanceledException derives from OperationCanceledException. So, if you want to retry in both cases, it is enough to handle only the latter.

public static readonly AsyncPolicy RetryPolicy = Policy
   .Handle<HttpRequestException>()
   .Or<OperationCanceledException>()
   .RetryAsync(3);
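As a minimal sketch of how such a policy might be applied: because re-sending the same HttpRequestMessage throws InvalidOperationException, each attempt below builds a fresh request. The ResilientSender class, the SendWithRetryAsync method and the requestFactory parameter are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Polly;

public static class ResilientSender
{
    // Same triggers as the policy above.
    private static readonly AsyncPolicy RetryPolicy = Policy
        .Handle<HttpRequestException>()
        .Or<OperationCanceledException>()
        .RetryAsync(3);

    // requestFactory (hypothetical) must return a NEW HttpRequestMessage on
    // every call, since a message instance cannot be sent twice.
    public static Task<HttpResponseMessage> SendWithRetryAsync(
        HttpClient client,
        Func<HttpRequestMessage> requestFactory,
        CancellationToken cancellationToken = default)
    {
        // The token is passed to both Polly and SendAsync so that
        // cancellation requested by the caller can be observed.
        return RetryPolicy.ExecuteAsync(
            ct => client.SendAsync(requestFactory(), ct),
            cancellationToken);
    }
}
```

A call could look like `await ResilientSender.SendWithRetryAsync(client, () => new HttpRequestMessage(HttpMethod.Get, url));`, where client and url are supplied by the caller.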

Status Codes

The section above covered only the cases where SendAsync fails with an exception. But there are also cases where SendAsync does not fail, yet the request was not processed. SendAsync returns an HttpResponseMessage object, and its StatusCode property can indicate a retryable problem. Typical retryable status codes:

408 Request Timeout
429 Too Many Requests
(500 Internal Server Error)
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout

Internal Server Error is usually treated as a retryable status code, but it really depends on the downstream system's implementation. You might have called an endpoint which has a bug, in which case re-issuing the same request will most probably result in a 500 again.
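The status codes above could be combined with the exception triggers roughly as follows. This is a sketch, not a definitive implementation: the HttpRetryPolicies class and the IsRetryable helper are hypothetical names, and 500 is deliberately left out of the set, which is an assumption you may want to revisit for your downstream system.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using Polly;

public static class HttpRetryPolicies
{
    // 500 is intentionally omitted here (assumption); add it if the
    // downstream system is known to fail transiently with it.
    private static readonly HashSet<HttpStatusCode> RetryableStatusCodes =
        new HashSet<HttpStatusCode>
        {
            HttpStatusCode.RequestTimeout,      // 408
            (HttpStatusCode)429,                // Too Many Requests
            HttpStatusCode.BadGateway,          // 502
            HttpStatusCode.ServiceUnavailable,  // 503
            HttpStatusCode.GatewayTimeout       // 504
        };

    public static bool IsRetryable(HttpResponseMessage response) =>
        RetryableStatusCodes.Contains(response.StatusCode);

    // Policy<HttpResponseMessage> is needed so that the policy can inspect
    // the returned response in addition to thrown exceptions.
    public static readonly AsyncPolicy<HttpResponseMessage> RetryPolicy =
        Policy<HttpResponseMessage>
            .Handle<HttpRequestException>()
            .Or<OperationCanceledException>()
            .OrResult(response => IsRetryable(response))
            .RetryAsync(3);
}
```

Note that the policy type changes from AsyncPolicy to AsyncPolicy<HttpResponseMessage>, so it can only wrap delegates that return an HttpResponseMessage.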

If you look at the remarks of AddTransientHttpErrorPolicy, you can see that it hands you a policy builder which is preconfigured to trigger on any of the following conditions (you then choose the actual policy, such as retry, on top of it):

Network failures (as HttpRequestException)
HTTP 5XX status codes (server errors)
HTTP 408 status code (request timeout)

This can be a good starting point, and you can extend its triggers with any of the exceptions and/or status codes mentioned above.
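A possible way to extend those default triggers is sketched below. It assumes the Microsoft.Extensions.Http.Polly package is referenced; the HttpClientRegistration class, the AddDownstreamClient method and the client name "downstream" are hypothetical.

```csharp
using System.Net;
using Microsoft.Extensions.DependencyInjection;
using Polly;

public static class HttpClientRegistration
{
    public static IServiceCollection AddDownstreamClient(this IServiceCollection services)
    {
        services.AddHttpClient("downstream")
            // The builder already handles HttpRequestException, 5XX and 408.
            .AddTransientHttpErrorPolicy(builder => builder
                // Extend the triggers with 429 Too Many Requests.
                // Further exceptions could be added the same way via Or<TException>().
                .OrResult(response => response.StatusCode == (HttpStatusCode)429)
                .RetryAsync(3));

        return services;
    }
}
```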

Idempotent operation

Please bear in mind that not all operations are safe to retry. Requests which create new resources (usually POST requests in REST) might result in unwanted duplicates when repeated. If the downstream system is not prepared for de-duplication, then you should avoid retrying them.
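One simple way to act on this is to decide by HTTP method whether a retry is allowed at all. The sketch below is only a heuristic: it relies on the methods that HTTP semantics define as idempotent, which says nothing about how a particular downstream endpoint actually behaves. RetryGuard and IsIdempotent are hypothetical names.

```csharp
using System.Net.Http;

public static class RetryGuard
{
    // PUT and DELETE plus the safe methods (GET, HEAD, OPTIONS, TRACE) are
    // defined as idempotent by HTTP semantics; POST and PATCH are not.
    public static bool IsIdempotent(HttpMethod method) =>
        method == HttpMethod.Get
        || method == HttpMethod.Head
        || method == HttpMethod.Options
        || method == HttpMethod.Trace
        || method == HttpMethod.Put
        || method == HttpMethod.Delete;
}
```

The caller could then pick the retry policy when `RetryGuard.IsIdempotent(request.Method)` is true and fall back to a pass-through policy such as `Policy.NoOpAsync()` otherwise.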

Here I have detailed the considerations for retry.