Tags: c#, concurrency, task, semaphore, .net-4.6.2

.NET tasks run in parallel with SemaphoreSlim sometimes do not run?


I am setting up separate .NET (C#) tasks to run in parallel in an ASP.NET project (.NET 4.6.2) and want to limit the number that can run at once to 10. Most examples I have seen for this use SemaphoreSlim, and I have set up my logic the same way as those examples:

            // Limit the concurrent number of threads to 10
            var throttler = new SemaphoreSlim(10);

            // Create a list of tasks to run
            var tasks = images.Select(async image =>
            {
                _logger.Info("WAITING FOR THROTTLER");
                await throttler.WaitAsync();
                try
                {
                    _logger.Info("ABOUT TO DOWNLOAD IMAGE FROM API A");
                    var imageData = await DownloadImage(...);

                    if (imageData != null)
                    {
                        _logger.Info("ABOUT TO UPLOAD IMAGE TO API B");
                        await AddFileAsync(...);
                    }
                }
                catch (Exception e)
                {
                    _logger.Error("Failed on UploadImages: " + e.Message);
                }
                finally
                {
                    throttler.Release();// Always release the semaphore when done
                }
            });

            await Task.WhenAll(tasks);// Now we actually run the tasks

For context, this code runs as fire-and-forget and is triggered as part of a web request. Each task downloads image data from an API (A) and then uploads that data to a different API (B) via HttpClient.

Testing this with 12 images, sometimes it works fine. But there is an intermittent issue where 2 of the images never make it through, and no error is thrown or returned from the API calls. I added the log statements to narrow it down and found that after the first 10 are processed, the last 2 never get past the "WAITING FOR THROTTLER" log. So they seem to be stuck waiting for the semaphore to be released?

Output logs from one of these issue instances:

2023-03-15 18:33:17,049 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,049 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,050 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,050 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,051 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,051 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,053 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,053 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,054 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,054 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,055 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,055 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,056 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,056 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,057 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,058 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,059 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,059 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
2023-03-15 18:33:17,059 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,059 [41] INFO MyService  ABOUT TO DOWNLOAD IMAGE FROM API A
SEE HERE TWO LINES BELOW
2023-03-15 18:33:17,061 [41] INFO MyService  WAITING FOR THROTTLER
2023-03-15 18:33:17,061 [41] INFO MyService  WAITING FOR THROTTLER
SEE HERE TWO LINES ABOVE
2023-03-15 18:33:17,996 [74] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,231 [40] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,241 [40] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,247 [51] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,253 [51] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,259 [51] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,265 [79] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,271 [79] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,277 [40] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B
2023-03-15 18:33:18,309 [74] INFO MyService  ABOUT TO UPLOAD IMAGE TO API B

I have marked the two lines in the logs that suggest to me the issue is related to the semaphore. There are 10 log entries each for "DOWNLOAD" and "UPLOAD", but all 12 show "WAITING FOR THROTTLER". So you can see that two of the tasks never reach the point of calling the API and are stuck waiting on the semaphore.

Why does this happen, especially when it works some of the time? Is it some sort of timing issue?

*I'll also note that using Parallel.ForEachAsync is not an option as this project is running on .NET 4.6.2 and upgrading is not an option right now.


Solution

  • "this code runs as fire-and-forget"

    Found the problem. ^

    Fire-and-forget code is unreliable in the general case.

    "no error is thrown or returned from the API calls."

    Also normal for fire-and-forget code.


    OK, so here's the problem(s) with fire-and-forget:

    • ASP.NET is aware of requests and responses. Anything outside of that (i.e., fire-and-forget) is code that exists outside of the knowledge of ASP.NET (by default).
    • Therefore, whenever ASP.NET exits (e.g., whenever you upgrade your code), this request-extrinsic work will just stop.
    • There's no logging AFAIK. There might be an exception; if so, it will be ignored.
    • There's no error code returned from API calls, of course, because this is request-extrinsic code so there is no response to hold the error code.

    These are the normal problems with fire-and-forget code. If you want to ensure the work gets done, build a proper distributed architecture. (Links are to my blog). It's some work, but that's what is necessary for reliable processing.

    In addition to the normal problems above, your fire-and-forget code has an additional problem:

    "I found a number of suggestions on using .ConfigureAwait(false)"

    What's likely happening is that your request code is just calling it like _ = MyFireAndForgetMethod(...);. The problem is that ASP.NET pre-Core has a request context, and by default await captures that context and resumes on it (ConfigureAwait(false) overrides this behavior and tells await not to resume on the request context). This is a problem because after the request is completed (i.e., after the response has been sent), that request context is no longer valid. Code often (but not always) fails when attempting to use a request context that is no longer valid. (Link is to my blog)
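
    As a minimal sketch, this is roughly what the question's loop might look like with ConfigureAwait(false) applied. DownloadImageAsync and AddFileAsync below are hypothetical stand-ins for the elided DownloadImage(...) and AddFileAsync(...) calls, and the class name, method name, and returned count are assumptions made only for illustration. Keep in mind that ConfigureAwait(false) only has an effect when an await actually yields, so it has to appear on every await in the chain; an await that completes synchronously (such as WaitAsync while a slot is free) leaves the code running on the original context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledUploader
{
    // Hypothetical stand-in for the question's DownloadImage(...) call.
    private static async Task<byte[]> DownloadImageAsync(string image)
    {
        await Task.Delay(10).ConfigureAwait(false);
        return new byte[] { 1, 2, 3 };
    }

    // Hypothetical stand-in for the question's AddFileAsync(...) call.
    private static async Task AddFileAsync(string image, byte[] data)
    {
        await Task.Delay(10).ConfigureAwait(false);
    }

    // Processes every image with at most maxConcurrency in flight.
    // Returns how many images completed without an exception.
    public static async Task<int> UploadImagesAsync(IEnumerable<string> images, int maxConcurrency)
    {
        var completed = 0;
        using (var throttler = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = images.Select(async image =>
            {
                // Every await opts out of resuming on the captured context.
                await throttler.WaitAsync().ConfigureAwait(false);
                try
                {
                    var data = await DownloadImageAsync(image).ConfigureAwait(false);
                    if (data != null)
                    {
                        await AddFileAsync(image, data).ConfigureAwait(false);
                    }
                    Interlocked.Increment(ref completed);
                }
                catch (Exception e)
                {
                    // Log the whole exception, not only its message.
                    Console.Error.WriteLine("Failed on UploadImages: " + e);
                }
                finally
                {
                    throttler.Release();
                }
            }).ToList(); // Materialize so each task is started exactly once.

            await Task.WhenAll(tasks).ConfigureAwait(false);
        }
        return completed;
    }
}
```

    Calling ThrottledUploader.UploadImagesAsync(images, 10) would mirror the limit of 10 from the question. This reduces the dependence on the request context but does not remove the other fire-and-forget problems listed above.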

    To avoid this more completely, you can wrap your fire-and-forget in Task.Run: _ = Task.Run(() => MyFireAndForgetMethod(...));. Task.Run is generally considered an antipattern on ASP.NET, but you already have a way bigger antipattern with the fire-and-forget code, so adding a smaller antipattern to avoid this problem with the bigger antipattern is not a huge deal.
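
    The Task.Run wrapper could be sketched as a small helper like the one below. The FireAndForget class, its Run method, and the onError callback are hypothetical names introduced here, not part of the question's code or of any library. The idea is that the delegate starts on a thread-pool thread, where there is normally no SynchronizationContext to capture, and that any exception is at least handed to a logger instead of being silently dropped.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndForget
{
    // Starts the work on the thread pool so that awaits inside it
    // have no request context to capture, and routes any exception
    // to the supplied callback instead of leaving it unobserved.
    public static Task Run(Func<Task> work, Action<Exception> onError)
    {
        return Task.Run(async () =>
        {
            try
            {
                await work().ConfigureAwait(false);
            }
            catch (Exception e)
            {
                onError(e);
            }
        });
    }
}
```

    A call site might then read _ = FireAndForget.Run(() => MyFireAndForgetMethod(...), e => _logger.Error("Failed: " + e));. The returned task is still discarded, so the work can still be lost when the process exits.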

    Of course, you still have all the normal (and unavoidable) inherent problems with fire-and-forget to deal with. The only solution I could actually recommend is a distributed architecture.

    Note that this had nothing to do with tasks or concurrency or SemaphoreSlim; the problem is the result of fire-and-forget.