I am having a problem with Entity Framework Core 5.x where the "wrong" query is being generated when trying to find the collection of entities where a navigation property collection contains all elements from another list.
Given the following simplified (but standard many-to-many) model
public class Announcement {
public string Id { get; set; }
public string Title { get; set; }
public virtual IEnumerable<AnnouncementTag> Tags { get; set; }
}
public class Tag {
public string Id { get; set; }
public bool IsEnabled { get; set; }
public virtual IEnumerable<AnnouncementTag> Announcements{ get; set; }
}
// Needed since EF Core does not support many-to-many without explicit JOIN class
public class AnnouncementTag {
public string AnnouncementId { get; set; }
public string TagId { get; set; }
public Announcement Announcement { get; set; }
public Tag Tag { get; set; }
}
I would like to get the list of Announcements which have all of their tags match what is in a given list.
For example, with the following data
Announcement 1 has Tags A, B
Announcement 2 has Tags B
Announcement 3 has Tags C
A query for "all announcements with tags ["B"]" would return Announcement 2 since Announcement 3 has nothing that matches and Announcement 1 only matches on one tag.
I would have thought that the following LINQ query for return the results that I want
var applicableTags = new [] { "B" };
var results = db.Announcements.Where(a => a.Tags.All(at => applicableTags.Contains(at.Tag.Id));
But that results in all Announcements coming back. Just to simplify the query, I modified it to use the JOIN class directly (TagId = Tag.Id)
var results = db.Announcements.Where(a => a.Tags.All(at => applicableTags.Contains(at.TagId));
But that results in the same where all Announcements are returned. The generated query seems odd
SELECT * FROM [Announcements] AS [a]
WHERE NOT EXISTS (
SELECT 1
FROM [AnnouncementTag] AS [a1]
INNER JOIN [Tags] AS [t1] ON [a1].[TagId] = [t1].[Id]
-- Really... <>... opposite of what I would have expected
-- When I query for more than one Tag...say A,B...the <> becomes "NOT IN"
WHERE ([a].[Id] = [a1].[AnnouncementId]) AND ([t1].[Id] <> N'B')
)
I was able to duplicate this behavior with LINQ-to-Objects so it does not seem to be an Entity Framework thing... just me not understanding how to generate the query.
I did find one article that stated the following work
// Notice how the applicableTags and Tags criteria is flipped from previous query
var results = db.Announcements.Where(a => applicableTags.All(at => a.Tags.Any(t => t.Id == at)));
And it does...but only for LINQ-to-Objects. For Entity Framework, you get the following exception
System.InvalidOperationException: The LINQ expression 'DbSet<Announcement>()
.Where(a => __applicableTags_0
.All(at => DbSet<AnnouncementTag>()
.Where(a0 => EF.Property<string>(a, "Id") != null && object.Equals(
objA: EF.Property<string>(a, "Id"),
objB: EF.Property<string>(a0, "AnnouncementId")))
.Any(a0 => a0.TagId == at)))' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'
But I would not want to have to bring back all Announcements in order to further filter on the server.
Any suggestions on what I am doing wrong? Can somebody please explain the flaw in my original query that is returning the unexpected results?
Assuming that tags are unique, you can express your query in the following way:
var tagCount = applicableTags.Length;
var results = db.Announcements
.Where(a => a.Tags.Where(t => applicableTags.Contains(t.Id)).Count() >= tagCount);