When I create a query on a DbSet
with too many Concat
or Where
clauses, I get a stack overflow error.
Essentially I have the problem where I have a list of thousands of AND
clauses all connected with OR
clauses. It would look a little something like:
(A AND B) OR (C AND D) OR ...
The clauses are created from a list so the number of AND
clauses that are concatenated by the OR
clauses is dynamic and could be from 0 to thousands.
I tried creating selects for each AND
clause and using Concat
to combine multiple selects together using Entity Framework, but I get a stack overflow exception.
I feel like there should be a better way to write the code, but I'm not sure so I've included the error and some example code in a hope someone knows how this should be done without reverting back to writing inline SQL (Goes against the entity framework paradigm)
The exact error is as follows:
Stack overflow.
Repeat 798 times:
--------------------------------
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.ObjectModel.ReadOnlyCollection`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>, StateType ByRef, State[] ByRef, Boolean)
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Collections.ObjectModel.ReadOnlyCollection`1<System.Linq.Expressions.Expression>, StateType ByRef, State[] ByRef, Boolean)
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.VisitMethodCall(System.Linq.Expressions.MethodCallExpression)
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Linq.Expressions.Expression)
--------------------------------
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.ObjectModel.ReadOnlyCollection`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>, StateType ByRef, State[] ByRef, Boolean)
at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.Visit(System.Collections.ObjectModel.ReadOnlyCollection`1<System.Linq.Expressions.Expression>, StateType ByRef, State[] ByRef, Boolean)
Libraries:
EFCore.BulkExtensions.PostgreSql
Version="8.1.2"Microsoft.EntityFrameworkCore.Design
Version="9.0.0"Npgsql.EntityFrameworkCore.PostgreSQL
Version="9.0.2"Npgsql.EntityFrameworkCore.PostgreSQL.Design
Version="1.1.0"Here's an example how to reproduce the issue:
private class SimpleDbContext : DbContext
{
// Stores the values 0, 1, 2, ..., 100000
public virtual DbSet<SequencePoint> SequencePoints { get; set; }
}
private class SequencePoint
{
public int SequenceNumber { get; set; }
}
private void ConcatErrorTest()
{
SimpleDbContext simpleDbContext = new();
List<Tuple<int, int>> selectRanges = new(); // 0, 10, 11, 20, 21, 21, etc...
for (int i = 0; i < 7500; i++)
{
int startRange = i * 10;
int endRange = startRange + (i % 5);
selectRanges.Add(new Tuple<int, int>(startRange, endRange));
}
IQueryable<SequencePoint> queryable = null;
foreach (Tuple<int,int> selectRange in selectRanges)
{
IQueryable<SequencePoint> whereQueryable = simpleDbContext.SequencePoints.AsQueryable().Where(point =>
(point.SequenceNumber >= selectRange.Item1) &&
(point.SequenceNumber <= selectRange.Item2)
);
queryable = queryable == null ? whereQueryable : queryable.Concat(whereQueryable);
}
// Throws Stack overflow.
List<int> result = queryable.Select(sequenceNumber => sequenceNumber.SequenceNumber).ToList();
_logger.LogInformation("result = {result}", result);
}
Thanks for the responses.
In my case the example problem here was a simplification of my actual problem but the responses helped me get on the right track.
The most entity framework way of approaching this problem was to have the ranges in another table and use a multi select. Note a Join would not work here as you can't do that with a LINQ joins - LINQ only supports equijoins
.
Here's what the code would look like:
private class SimpleDbContext : DbContext
{
// Stores the values 0, 1, 2, ..., 100000
public virtual DbSet<SequencePoint> SequencePoints { get; set; }
// [0, 0], [10, 11], [20, 21, 22], etc...
public virtual DbSet<SelectionRange> SelectionRanges { get; set; }
}
private class SequencePoint
{
public int SequenceNumber { get; set; }
}
private class SelectionRange
{
public int LowerRange { get; set; }
public int UpperRange { get; set; }
}
private void ConcatReworked()
{
SimpleDbContext simpleDbContext = new();
IQueryable<SequencePoint> queryable = from sequencePoint in simpleDbContext.SequencePoints
from selectionRange in simpleDbContext.SelectionRanges
where sequencePoint.SequenceNumber >= selectionRange.LowerRange
where sequencePoint.SequenceNumber <= selectionRange.UpperRange
orderby sequencePoint.SequenceNumber
select sequencePoint;
List<int> result = queryable.Select(sequenceNumber => sequenceNumber.SequenceNumber).ToList();
_logger.LogInformation("result = {result}", result);
}
The other approach I tried was to write the Query manually using a StringBuilder which looked something along this lines of:
private async Task LoadSelection(List<SelectionRange> selectionRange)
{
using NpgsqlConnection connection = await npgsqlDataSource.OpenConnectionAsync();
StringBuilder commandSqlBuilder = new();
commandSqlBuilder.Append("SELECT ");
// ... More SQL here
for (int i = 0; i < selectionRange.Count; i++)
{
SelectionRange range = selectionRange[i];
commandSqlBuilder.Append("OR (");
commandSqlBuilder.Append($"value >= (@range_lower_param_{i})");
commandSqlBuilder.Append($"value <= (@range_upper_param_{i})");
// ... More SQL here
}
// ... More SQL here
NpgsqlCommand command = new NpgsqlCommand(commandSqlBuilder.ToString(), connection);
command.Parameters.AddWithValue(...);
NpgsqlDataReader reader = await command.ExecuteReaderAsync();
while (await reader.ReadAsync())
}
While this does work, and runs about twice as fast as the join, it does have new limitations such as A statement cannot have more than 65535 parameters
. So I needed to split it into smaller statements and join the results in code later. Also when the database names change in a migration I will need to come back an update this String Builder to reflect the changes.