I've got a program that processes results in batches of 100. It makes API requests (one item each) and then batch-processes the results.
Recently, the API I'm calling has developed scaling problems and, say, 10% of the requests fail or time out at any given time.
My problem is that when I try to process the ones that succeeded, I get new AggregateExceptions, each with only one entry in InnerExceptions. It seems like I'm touching some kind of time-bomb member in the Task array and I can't figure out what it is.
The outer loop sets up for the batch call and has the last-chance error handling:
    try
    {
        GetBatchStats();
    }
    catch (AggregateException e)
    {
        var firstE = e.InnerExceptions[0];
        _logger.Fatal($"{e.InnerExceptions.Count.ToString()} api requests threw exceptions. Sample exception: {firstE.Message}\nStackTrace: {firstE.StackTrace}");
    }
My original loop processing the batch had:
    int asyncExceptions = 0;
    try
    {
        Task.WaitAll(batchRequests);
    }
    catch (AggregateException e)
    {
        asyncExceptions = e.InnerExceptions.Count;
        var firstE = e.InnerExceptions[0];
        _logger.Error($"{asyncExceptions.ToString()} api requests out of {batchRequests.Length.ToString()} threw exceptions. Sample exception: {firstE.Message}\nStackTrace: {firstE.StackTrace}");
    }
    if (asyncExceptions == batchRequests.Length)
    { // All requests in this batch errored out; api must be down
        throw new Exception("All api requests in batch returned errors. Must be a problem with the api.");
    }
    int countErrs = 0;
    foreach (var apiResult in batchRequests)
    {
        if (apiResult.Exception != null || apiResult.Result == null) countErrs++;
        if (apiResult.Exception == null && apiResult.Result != null)
        { ...
        }
    }
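I would get the error message in the log about "X api requests out of Y threw exceptions", followed immediately by a fatal message about "1 api requests threw exceptions", indicating that a new AggregateException with a single inner exception was being thrown somewhere after the WaitAll catch block.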
I tried replacing the Task.Exception references with other ways of phrasing it, but I got the same result:
    int asyncExceptions = 0;
    try
    {
        Task.WaitAll(batchRequests);
    }
    catch (AggregateException e)
    {
        asyncExceptions = e.InnerExceptions.Count;
        var firstE = e.InnerExceptions[0];
        _logger.Error($"{asyncExceptions.ToString()} api requests out of {batchRequests.Length.ToString()} threw exceptions. Sample exception: {firstE.Message}\nStackTrace: {firstE.StackTrace}");
    }
    if (asyncExceptions == batchRequests.Length)
    { // All requests in this batch errored out; api must be down
        throw new Exception("All api requests in batch returned errors. Must be a problem with the api.");
    }
    int countErrs = 0;
    foreach (var apiResult in batchRequests)
    {
        if (apiResult.Status == TaskStatus.Faulted || apiResult.Result == null) countErrs++;
        if (apiResult.Status == TaskStatus.RanToCompletion && apiResult.Result != null)
        { ...
        }
    }
Hopefully another set of eyes might spot what I'm missing... How do you process just the tasks in a group that had successful results and skip the ones that errored out?
Thanks
Well, after a lot of googling, it appears the answer is here:
https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/task-cancellation
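Here is a minimal sketch of the behaviour as I understand it, not the real program. `BatchDemo`, `FakeApiCall` and `CollectSuccesses` are made-up names, and the fake call stands in for the actual API request: item 1 faults, item 2 is canceled, the rest succeed. A canceled task has a null `Exception` and is not `Faulted`, so both of the checks above fall through to `Result`, and reading `Result` on a canceled task throws a fresh AggregateException wrapping a single TaskCanceledException. Checking for `RanToCompletion` before touching `Result` avoids that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BatchDemo
{
    // Hypothetical stand-in for one API request (not the real call):
    // item 1 faults, item 2 is canceled, everything else succeeds.
    public static Task<string> FakeApiCall(int item)
    {
        if (item == 1)
            return Task.FromException<string>(new InvalidOperationException("api error"));
        if (item == 2)
            return Task.FromCanceled<string>(new CancellationToken(canceled: true));
        return Task.FromResult($"result {item}");
    }

    // Waits for the whole batch, then returns only the results of tasks
    // that ran to completion. Result is never read on a faulted or
    // canceled task, so nothing is rethrown here.
    public static List<string> CollectSuccesses(
        Task<string>[] batch, out int faulted, out int canceled)
    {
        try
        {
            Task.WaitAll(batch);
        }
        catch (AggregateException)
        {
            // Thrown if any task faulted or was canceled; the per-task
            // status is inspected below instead.
        }

        faulted = batch.Count(t => t.IsFaulted);
        canceled = batch.Count(t => t.IsCanceled);

        // The status check comes first, so Result is only evaluated
        // for tasks that actually succeeded.
        return batch
            .Where(t => t.Status == TaskStatus.RanToCompletion && t.Result != null)
            .Select(t => t.Result)
            .ToList();
    }

    public static void Main()
    {
        var batch = Enumerable.Range(0, 5).Select(FakeApiCall).ToArray();
        var ok = CollectSuccesses(batch, out int faulted, out int canceled);
        Console.WriteLine($"{ok.Count} succeeded, {faulted} faulted, {canceled} canceled");

        // The "time bomb": the canceled task has no Exception,
        // yet reading Result throws.
        var canceledTask = batch[2];
        Console.WriteLine(canceledTask.Exception == null); // True
        try
        {
            _ = canceledTask.Result;
        }
        catch (AggregateException e)
        {
            // One inner exception, of type TaskCanceledException
            Console.WriteLine($"{e.InnerExceptions.Count} inner: {e.InnerExceptions[0].GetType().Name}");
        }
    }
}
```

In the original loop, the equivalent change would be to test `apiResult.Status == TaskStatus.RanToCompletion` first and count everything else as an error without ever reading `apiResult.Result`.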
When a task in a group is canceled (e.g. HttpClient having a timeout),