I have a problem with an AggregateException thrown from a .NET library. For some reason it is not caught by my try-catch block. It crashes the whole application, and there are no methods from my code in the call stack. It happens when I test a VPN disconnection scenario: I start the app while connected to the VPN, disconnect while the app is running, and then the exception is thrown and the app crashes because it is not caught.
The repository's GetBatchAsync() method is called inside a Polly retryPolicy, which in turn runs inside Parallel.ForAsync().
P.S. This is not caused by the Tools > Options (or Debug > Options) > Debugging > General > Enable Just My Code setting.
The code:
await Parallel.ForAsync(0, totalBatchesCount, parallelOptions, async (i, cancellationToken) =>
{
    //...
    await retryPolicy.ExecuteAsync(async () =>
    {
        try
        {
            var batch = await _userRepository.GetBatchAsync(startId, endId);
            //...
        }
        catch (Exception ex)
        {
            throw new Exception($"Batch failed for startId: {startId} and endId {endId}", ex);
        }
    });
});
Repository method:
public async Task<List<User>> GetBatchAsync(int startId, int endId)
{
    try
    {
        await using var context = await _contextFactory.CreateDbContextAsync();
        return await context.Users
            .Where(u => u.UserId >= startId && u.UserId < endId)
            .Include(u => u.Invoice)
            .AsNoTracking()
            .ToListAsync();
    }
    catch (AggregateException ex)
    {
        Console.WriteLine(ex.ToString());
        throw;
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.ToString());
        throw;
    }
}
Retry policy:
var retryPolicy = Policy
    .Handle<Exception>()
    .WaitAndRetryAsync(POLLY_RETRY_COUNT, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
        (exception, timeSpan, retryCount, context) =>
        {
            _logger.LogWarning($"Retry {retryCount} due to: {exception.Message}, stack trace: {exception.StackTrace}, inner: {exception?.InnerException?.Message}");
        });
The exception:
Unhandled exception. System.AggregateException: One or more errors occurred. (Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding)
---> MySql.Data.MySqlClient.MySqlException (0x80004005): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding
---> System.TimeoutException: The operation has timed out.
at MySql.Data.Common.StreamCreator.<>c.<GetTcpStreamAsync>b__8_1()
at System.Threading.CancellationTokenSource.Invoke(Delegate d, Object state, CancellationTokenSource source)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
--- End of inner exception stack trace ---
at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
at System.Threading.TimerQueueTimer.Fire(Boolean isThreadPool)
at System.Threading.TimerQueue.FireNextTimers()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
Looking at that stack trace, it looks like someone registered a CancellationToken callback (with CancellationToken.Register), then the token was cancelled when the timeout elapsed, which called the callback, and the callback threw an exception. You can see that the call stack goes directly to the thread pool, via various methods on CancellationTokenSource which are responsible for calling callbacks registered on CancellationToken.
This means that the exception bubbles back up to the timer which cancelled the CancellationToken, and then back up to the ThreadPool, which causes your application to terminate. There's no way for you to catch that exception.
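To make the mechanism concrete, here is a minimal sketch (the class and method names are made up for illustration). It registers a throwing callback and then cancels the token by hand, so the exception lands in code we control rather than on a timer thread:

```csharp
using System;
using System.Threading;

public static class CallbackThrowDemo
{
    // Cancels a token whose registered callback throws, and reports
    // what the code that called Cancel() observes.
    public static string CancelAndObserve()
    {
        using var cts = new CancellationTokenSource();

        // The callback runs on whichever thread cancels the token.
        using var registration = cts.Token.Register(
            () => throw new TimeoutException("thrown from the callback"));

        try
        {
            // Cancel() invokes the callbacks synchronously and wraps
            // anything they throw in an AggregateException.
            cts.Cancel();
            return "no exception";
        }
        catch (AggregateException ex)
        {
            return $"{ex.GetType().Name} -> {ex.InnerException?.GetType().Name}";
        }
    }

    public static void Main()
    {
        Console.WriteLine(CancelAndObserve()); // AggregateException -> TimeoutException
    }
}
```

Here the AggregateException is catchable only because our own code calls Cancel(). When the cancellation comes from a timeout instead (for example via CancelAfter), the caller is a timer callback on the thread pool, so there is no user frame to put a try-catch around, which matches the stack trace above.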
Indeed, if we go and look at the source, we find:
if (execAsync)
    using (cancellationToken.Register(() => throw new MySqlException(Resources.Timeout, new TimeoutException())))
        await tcpClient.ConnectAsync(settings.Server, (int)settings.Port).ConfigureAwait(false);
else
    if (!tcpClient.ConnectAsync(settings.Server, (int)settings.Port).Wait((int)settings.ConnectionTimeout * 1000))
        throw new MySqlException(Resources.Timeout, new TimeoutException());
That's some seriously questionable code, written by someone who has no idea what they are doing. Indeed there's a bug report about it from 2023, which is still unaddressed.
It looks like the code path where execAsync is false is more sensible. I'm not sure how to trigger that, but in lieu of a fix from upstream, I'd recommend trying to hit that code path instead. Yes, it's blocking, but you can wrap it in a Task.Run (not ideal, but better than an unrecoverable crash).
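A rough sketch of the Task.Run idea follows. BlockingQueryThatTimesOut is a hypothetical stand-in for a synchronous query (for example ToList() instead of ToListAsync()); I'm assuming, without having verified it, that the synchronous API is what leads to execAsync being false. The point is only that an exception thrown inside the delegate is captured by the returned Task and rethrown at the await, where an ordinary try-catch (or Polly) can see it:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingWrapperDemo
{
    // Hypothetical stand-in for a blocking database call that times out.
    static int BlockingQueryThatTimesOut()
    {
        Thread.Sleep(50);
        throw new TimeoutException("simulated connect timeout");
    }

    public static async Task<string> RunAsync()
    {
        try
        {
            // The blocking work runs on a thread pool thread; its exception
            // is stored in the Task and rethrown here by await.
            int rows = await Task.Run(() => BlockingQueryThatTimesOut());
            return $"completed with {rows} rows";
        }
        catch (TimeoutException ex)
        {
            return $"caught: {ex.Message}";
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync()); // caught: simulated connect timeout
    }
}
```

The trade-off is that each in-flight query ties up a thread pool thread for its whole duration, so with Parallel.ForAsync you may want to keep MaxDegreeOfParallelism modest.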
Alternatively, this issue recommends using https://www.nuget.org/packages/MySqlConnector. I have no experience with it, but it isn't written by Oracle which is a good start!