Search code examples
entity-frameworkentity-framework-6

Selecting Subcollections in Union-Query


I'm trying to select objects from the database with Entity Framework into an anonymous type. When using Union and Selecting a Subcollection, I get an exception:

System.ArgumentException: The 'Distinct' operation cannot be applied to the collection ResultType of the specified argument.

My model contains several types derived from BaseType. This base type has a reference to RefType which contains a collection of ItemType. The types derived from BaseType are stored in separate tables, thus the Union.

The query looks like this:

var q1 = ctx.Set<Type1>().Select(x => new { x.Id, x.Ref.Items });
var q2 = ctx.Set<Type2>().Select(x => new { x.Id, x.Ref.Items });
q1.Union(q2).ToList();

But to reproduce the error, you can even union queries of the same type, as long as you select a collection.

I would do the select after the union, but to union Type1, Type2, etc. I must cast them to BaseType, which is not allowed in LINQ-to-SQL.

Any way to do this in the same query?


Solution

  • The exception emerges from Entity Framework's query generation pipeline when the ExpressionConverter tries to translate the expression q1.Union(q2) into SQL.

    In a valid query you'll see that EF adds a DISTINCT clause to the SQL query. A type with collection properties (x.Ref.Items) doesn't pass as a valid argument for a Distinct operation and EF throws the exception you see.

    Unfortunately, using Concat instead of Union is not a valid work-around. EF will also throw an exception:

    The nested query is not supported. Operation1='UnionAll' Operation2='MultiStreamNest'

    Which means that it's simply not supported to concat nested queries containing types with collection properties.

    So you have to do the Union in memory:

    var result = q1.AsEnumerable() // Continues the query in memory
                   .Union(q2).ToList();
    

    C# doesn't have any problem with equating anonymous types containing collections: it simply considers any collection member as unequal to another collections member. This does mean that the query can produce a collection containing non-unique results (same Id, same Items) which may not be expected when relying on Union's implicit Distinct.