I'm trying to remove duplicates from my database. I'm using Entity Framework Core and .NET 5. EF Core is having trouble materializing my group by:
protected async Task RemoveDuplicates(CryptoInfoContext cryptoContext)
{
try
{
var duplicates = cryptoContext.HistoricalCandles
.GroupBy(x => new { x.StartDate, x.GranularitySeconds })
.Where(x => x.Count() > 1)
.ToList()
.Select(x => x.FirstOrDefault())
.ToList();
cryptoContext.RemoveRange(duplicates);
await cryptoContext.SaveChangesAsync();
}
catch(Exception ex)
{
Console.WriteLine(ex);
}
}
I'm getting an error:
Unable to translate the given 'GroupBy' pattern. Call 'AsEnumerable' before 'GroupBy' to evaluate it client-side
I don't feel like materializing all of my rows to remove the duplicates. Is there a list of known issues with group by? how can I work around this issue?
Thanks to @GertArnold for pointing me in the right direction. Apparently EF Group by only support aggregate queries.
As pointed out by this article:
var ctx = new EntertainmentDbContext(conString);
var dataTask = ctx
.Ratings
.GroupBy(x => x.Source)
.Select(x => new {Source = x.Key, Count = x.Count()})
.OrderByDescending(x => x.Count)
.ToListAsync();
var data = await dataTask;
Which will generate SQL like so:
SELECT "r"."Source", COUNT(*) AS "Count"
FROM "Ratings" AS "r"
GROUP BY "r"."Source"
ORDER BY COUNT(*) DESC
Note: currently pretty much nothing else works, you CANNOT even apply a where before the group by.
Additionally there is an open EF Core issue Please vote on it so they actually fix this instead of pushing it another release