Search code examples
c#sql-serversql-server-2017unit-of-work.net-core-2.2

Why would I run out of SQL connections using the unit of work pattern?


When we put many messages on the bus, and they call a process in our business logic, this error is occurring:

The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.

This doesn't happen when, say, 15 messages call our process. However, it does happen when 80, or 130, processes are invoked.

We're using the unit of work pattern, and the connection is being closed after use. So I'm struggling to understand why it wouldn't be available in the pool for the next process.

This is how the unit of work is used in our app:

using (var uow = _uowFactory.Create(true))
{
    await uow.AccrualRepo.AddAccrualHistoriesAsync(histories);
    await uow.CommitAsync();
}

And this is how the factory returns a uow:

public class UnitOfWorkFactory : IUnitOfWorkFactory
{
    private readonly IConfiguration _configuration;
    private readonly IMediator _mediator;
    private readonly IStateAccessor _stateAccessor;
    private readonly ITimeProvider _timeProvider;
    private readonly IDbConnection _connection;
    private readonly IAccrualMapper _accrualMapper;
    private readonly ILogger<RepoBase> _logger;

    public UnitOfWorkFactory(IConfiguration configuration, IDbConnection sqlConnection, IMediator mediator,
        IStateAccessor stateAccessor, ITimeProvider timeProvider, IAccrualMapper accrualMapper, ILogger<RepoBase> logger)
    {
        _configuration = configuration;
        _mediator = mediator;
        _stateAccessor = stateAccessor;
        _timeProvider = timeProvider;
        _connection = sqlConnection;
        _accrualMapper = accrualMapper;
        _logger = logger;
    }

    public IUnitOfWork Create(bool useTransaction)
    {
        return new UnitOfWork(_configuration, _connection, _mediator, _stateAccessor, _timeProvider, _accrualMapper, _logger, useTransaction);
    }

Our Startup.cs file sets these for dependency injection this way:

services.AddTransient<IUnitOfWorkFactory, UnitOfWorkFactory>();
services.AddTransient<IDbConnection, SqlConnection>();

I now this is a lot of code, but our uow looks like this. Notice that the connection is closed after calling CommitAsync() and when disposing.

public class UnitOfWork : IUnitOfWork, IDisposable
{
    private readonly IConfiguration _configuration;
    private readonly IMediator _mediator;
    private readonly IStateAccessor _stateAccessor;
    private readonly ITimeProvider _timeProvider;
    private readonly IAccrualMapper _accrualMapper;
    private readonly ILogger<RepoBase> _logger;
    private IDbConnection _connection;
    private IDbTransaction _transaction;
    private IAccrualRepo _accrualRepo;
    private bool _disposed;
    private bool _commitOccurred;
    private bool _useTransaction;

    public UnitOfWork(IConfiguration configuration, IDbConnection sqlConnection, IMediator mediator, 
        IStateAccessor stateAccessor, ITimeProvider timeProvider, IAccrualMapper accrualMapper,
        ILogger<RepoBase> logger, bool useTransaction = true)
    {
        _configuration = configuration;
        _mediator = mediator;
        _stateAccessor = stateAccessor;
        _timeProvider = timeProvider;
        _useTransaction = useTransaction;
        _accrualMapper = accrualMapper;
        _logger = logger;
        _connection = sqlConnection;
        _connection.ConnectionString = _configuration["ConnectionString"];
        _connection.Open();

        if (useTransaction)
        {
            _transaction = _connection.BeginTransaction();
        }
    }

    public IAccrualRepo AccrualRepo
    {
        get => _accrualRepo ?? (_accrualRepo = new AccrualRepo(_configuration, _connection,
            _transaction, _stateAccessor, _timeProvider, _mediator, _logger));
        set => _accrualRepo = value;
    }

    public async Task CommitAsync()
    {
        if (!_useTransaction)
        {
            throw new InvalidOperationException("Attempting to call commit on a unit of work that isn't using a transaction");
        }

        try
        {
            _transaction.Commit();
            _commitOccurred = true;
            await InvokePostCommitOnReposAsync();
        }
        catch
        {
            _transaction.Rollback();
            throw;
        }
        finally
        {
            _connection.Close();
            _transaction.Dispose();
            ResetRepositories();
        }
    }

    private async Task InvokePostCommitOnReposAsync()
    {
        var repos = new List<RepoBase>();
        if (_accrualRepo != null) { repos.Add((RepoBase)_accrualRepo); }

        try
        {
            foreach (var repo in repos)
            {
                await repo.PostCommitAsync();
            }
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Exception occurred while invoking post commit on a repo.");
        }
    }

    private void ResetRepositories()
    {
        _accrualRepo = null; // Note: there are more repos here, but removed for clarity.
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // Already disposed; no need for the GC to finalize.
    }

    protected virtual void Dispose(bool calledFromDisposeAndNotFromFinalizer)
    {
        if (_disposed) { return; }

        if (calledFromDisposeAndNotFromFinalizer)
        {
            // If the user never called commit, but we are using a transaction, then roll back.
            if (!_commitOccurred && _useTransaction && _transaction != null) { _transaction.Rollback(); }

            if (_transaction != null) { _transaction.Dispose(); _transaction = null; }
            if (_connection != null) { _connection.Dispose(); _connection = null; }
        }

        _disposed = true;
    }
}

So why would we have this connection pool issue? Is something being done incorrectly here? Maybe we need to increase the connection pool size?


Solution

  • Connections pool size is simultaneous connections allowed. For example, the default is 100 for SQL Server. If the number of connections at one time goes above that, they have to wait for previous connections to close.

    If you have a lot of messages that can come in at once, I recommend increasing the connection pool size.

    https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql-server-connection-pooling

    However, if you notice you are getting this message after a certain amount of run-time. It's probably because there is a problem with the code and some connections aren't being closed.