Search code examples
c#entity-framework-corestack-overflow

Getting a stack overflow with too many concat


When I create a query on a DbSet with too many Concat or Where clauses, I get a stack overflow error.

Essentially I have the problem where I have a list of thousands of AND clauses all connected with OR clauses. It would look a little something like:

(A AND B) OR (C AND D) OR ...

The clauses are created from a list so the number of AND clauses that are concatenated by the OR clauses is dynamic and could be from 0 to thousands.

I tried creating selects for each AND clause and using Concat to combine multiple selects together using Entity Framework, but I get a stack overflow exception.

I feel like there should be a better way to write the code, but I'm not sure so I've included the error and some example code in a hope someone knows how this should be done without reverting back to writing inline SQL (Goes against the entity framework paradigm)

The exact error is as follows:

Stack overflow.
Repeat 798 times:
--------------------------------
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.ObjectModel.ReadOnlyCollection`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>, StateType ByRef, State[] ByRef, Boolean)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Collections.ObjectModel.ReadOnlyCollection`1<System.Linq.Expressions.Expression>, StateType ByRef, State[] ByRef, Boolean)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.VisitMethodCall(System.Linq.Expressions.MethodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Linq.Expressions.Expression)
--------------------------------
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.ObjectModel.ReadOnlyCollection`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>, StateType ByRef, State[] ByRef, Boolean)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Collections.ObjectModel.ReadOnlyCollection`1<System.Linq.Expressions.Expression>, StateType ByRef, State[] ByRef, Boolean)

Libraries:

  • EFCore.BulkExtensions.PostgreSql Version="8.1.2"
  • Microsoft.EntityFrameworkCore.Design Version="9.0.0"
  • Npgsql.EntityFrameworkCore.PostgreSQL Version="9.0.2"
  • Npgsql.EntityFrameworkCore.PostgreSQL.Design Version="1.1.0"

Here's an example how to reproduce the issue:

private class SimpleDbContext : DbContext
{
    // Stores the values 0, 1, 2, ..., 100000
    public virtual DbSet<SequencePoint> SequencePoints { get; set; }
}

private class SequencePoint
{
    public int SequenceNumber { get; set; }
}
    
private void ConcatErrorTest()
{
    SimpleDbContext simpleDbContext = new();
        
    List<Tuple<int, int>> selectRanges = new(); // 0, 10, 11, 20, 21, 21, etc...
        
    for (int i = 0; i < 7500; i++)
    {
        int startRange = i * 10;
        int endRange = startRange + (i % 5);
        selectRanges.Add(new Tuple<int, int>(startRange, endRange));
    }

    IQueryable<SequencePoint> queryable = null;

    foreach (Tuple<int,int> selectRange in selectRanges)
    {
        IQueryable<SequencePoint> whereQueryable = simpleDbContext.SequencePoints.AsQueryable().Where(point =>
            (point.SequenceNumber >= selectRange.Item1) &&
            (point.SequenceNumber <= selectRange.Item2)
        );
            
        queryable = queryable == null ? whereQueryable : queryable.Concat(whereQueryable);
    }

    // Throws Stack overflow.
    List<int> result = queryable.Select(sequenceNumber => sequenceNumber.SequenceNumber).ToList();
        
    _logger.LogInformation("result = {result}", result);
}

Solution

  • Thanks for the responses.

    In my case the example problem here was a simplification of my actual problem but the responses helped me get on the right track.

    The most entity framework way of approaching this problem was to have the ranges in another table and use a multi select. Note a Join would not work here as you can't do that with a LINQ joins - LINQ only supports equijoins.

    Here's what the code would look like:

    private class SimpleDbContext : DbContext
    {
        // Stores the values 0, 1, 2, ..., 100000
        public virtual DbSet<SequencePoint> SequencePoints { get; set; }
        
        // [0, 0], [10, 11], [20, 21, 22], etc...
        public virtual DbSet<SelectionRange> SelectionRanges { get; set; }
    }
    
    private class SequencePoint
    {
        public int SequenceNumber { get; set; }
    }
    
    private class SelectionRange
    {
        public int LowerRange { get; set; }
        public int UpperRange { get; set; }
    }
    
    private void ConcatReworked()
    {
        SimpleDbContext simpleDbContext = new();
        
        IQueryable<SequencePoint> queryable = from sequencePoint in simpleDbContext.SequencePoints
            from selectionRange in simpleDbContext.SelectionRanges
            where sequencePoint.SequenceNumber >= selectionRange.LowerRange
            where sequencePoint.SequenceNumber <= selectionRange.UpperRange
            orderby sequencePoint.SequenceNumber
            select sequencePoint;
    
        List<int> result = queryable.Select(sequenceNumber => sequenceNumber.SequenceNumber).ToList();
        
        _logger.LogInformation("result = {result}", result);
    }
    

    The other approach I tried was to write the Query manually using a StringBuilder which looked something along this lines of:

    private async Task LoadSelection(List<SelectionRange> selectionRange)
    {
        using NpgsqlConnection connection = await npgsqlDataSource.OpenConnectionAsync();
        StringBuilder commandSqlBuilder = new();
        commandSqlBuilder.Append("SELECT ");
        // ... More SQL here
        for (int i = 0; i < selectionRange.Count; i++)
        {
            SelectionRange range = selectionRange[i];
            commandSqlBuilder.Append("OR (");
            commandSqlBuilder.Append($"value >= (@range_lower_param_{i})");
            commandSqlBuilder.Append($"value <= (@range_upper_param_{i})");
            // ... More SQL here
        }
        // ... More SQL here
        NpgsqlCommand command = new NpgsqlCommand(commandSqlBuilder.ToString(), connection);
        command.Parameters.AddWithValue(...);
        NpgsqlDataReader reader = await command.ExecuteReaderAsync();
        while (await reader.ReadAsync())
    }
    

    While this does work, and runs about twice as fast as the join, it does have new limitations such as A statement cannot have more than 65535 parameters. So I needed to split it into smaller statements and join the results in code later. Also when the database names change in a migration I will need to come back an update this String Builder to reflect the changes.