Search code examples
c#multithreadingdeadlocksemaphore

SemaphoreSlim deadlocks on 9th thread


I have a test automation environment with multithreaded tests that use a shared HttpClient to test methods on our Web API. After the HttpClient has been initialized, it can be used by all of our tests running on multiple threads, since it is a thread safe object. However, keeping the initialization from happening more than once is a challenge. Furthermore, it includes the await keyword in it, so it cannot use any basic lock technology to ensure the initialization operation is atomic.

To make sure the initialization happens properly, I am using a SemaphoreSlim to create a mutex for initialization. To get access to the object, all tests have to call a function that uses the SemaphoreSlim to make sure it has been properly initialized by the first thread to request it.

I found the following implementation for using SemaphoreSlim on this web page.

public class TimedLock
{
    private readonly SemaphoreSlim toLock;

    public TimedLock()
    {
        toLock = new SemaphoreSlim(1, 1);
    }

    public LockReleaser Lock(TimeSpan timeout)
    {
        if (toLock.Wait(timeout))
        {
            return new LockReleaser(toLock);
        }

        throw new TimeoutException();
    }

    public struct LockReleaser : IDisposable
    {
        private readonly SemaphoreSlim toRelease;

        public LockReleaser(SemaphoreSlim toRelease)
        {
            this.toRelease = toRelease;
        }
        public void Dispose()
        {
            toRelease.Release();
        }
    }
}

I use this class like so:

private static HttpClient _client;
private static TimedLock _timedLock = new();

protected async Task<HttpClient> GetClient()
{
    using (_timedLock.Lock(TimeSpan.FromSeconds(600)))
    {
        if (_client != null)
        {
            return _client;
        }

        MyWebApplicationFactory<Startup> factory = new();
        _client = factory.CreateClient();

        Request myRequest = new Request()
        {
            //.....Authentication code
        };

        HttpResponseMessage result = await _client.PostAsJsonAsync("api/accounts/Authenticate", myRequest);
        result.EnsureSuccessStatusCode();
        AuthenticateResponse Response = await result.Content.ReadAsAsync<AuthenticateResponse>();

        _client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", Response.Token);
        return _client;
    }
}

It worked flawlessly until just recently, when I added a ninth thread to my code. I've had to dial that back to 8 threads, because whenever I allow a 9th thread to call the TimedLock.Lock method, the entire program deadlocks.

Does anyone know what might be going on, or how to work around this problem?


Solution

  • OK. I figured out my own problem, and it really was MY own problem, nobody else's.

    If you compare my code above really closely to the source that I quoted as getting it from, you'll notice there's actually a difference. The original code implements the Lock function asynchronously using the WaitAsync function built into SemaphoreSlim:

    public async Task<LockReleaser> Lock(TimeSpan timeout)
    {
        if(await toLock.WaitAsync(timeout))
        {
            return new LockReleaser(toLock);
        }
        throw new TimeoutException();
    }
    

    And of course, in my code that uses it, add the await keyword at the proper place to take care of the added Task object properly:

    ...
    using (await _timedLock.Lock(TimeSpan.FromSeconds(6000)))
    {
    ...
    

    Yesterday I "discovered" that if I changed the toLock object to use WaitAsync, the problem magically went away, and I was SO proud of myself. But then just a few minutes ago, when I was copying and pasting the "original" code into my question, I realized that the "original" code actually included "my" fix!

    I now remember looking at this a few months ago and wondering why they needed to make this an Async function. So in my superior wisdom, I tried it without the Async, saw that it worked fine, and continued on until I just recently started using enough threads to demonstrate why it is necessary!

    So to keep from confusing people, I changed the code in my question to be the bad code that I originally changed it to be, and put the truly original good code above here in "my" answer, which actually should be credited to Richard Blewett, the author of the referenced article.

    I can't say I completely understand why this fix actually works, so any further answers that can explain it better are more than welcome!