c# linq linqpad

LINQ produces wrong GroupBy output


The following code produces incorrect output: 10 keys, each with 1 value. If I remove the ToList() call from Games, it produces the correct output of 2 keys, each with 5 values. Likewise, if I leave Games.ToList() as is and remove the Dto from the GroupBy, the output is correct. How do I get this query to work correctly with both Games.ToList() and the Dto in place? (Games is a SQL table. Note: when Games is a list it does not work, but when Games is an IQueryable it works.)

void Main()
{
    Games.ToList()
        .GroupBy(g => new MatchDto { MatchdayId = g.MatchdayId, KickOffDate = g.KickOffUtc.Date })
        .Select(grp => new
        {
            Key = grp.Key,
            Values = grp.ToList()
        })
        .Dump();
}

public class MatchDto
{
    public int MatchdayId { get; set; }
    public DateTime KickOffDate { get; set; }
}

Solution

  • If Games is a list (or you keep the .ToList() call), the query is executed on the client. There, GroupBy evaluates the key selector for every element, which creates a new MatchDto object each time. Since MatchDto neither overrides Equals (and GetHashCode) nor implements IEquatable<MatchDto>, keys are compared by reference, so GroupBy correctly treats every one of them as different. If Games is an EF table and you remove the ToList() call, the grouping is translated to SQL, where the missing equality implementation on MatchDto does not matter.

    Hence, it is not that LINQ produces wrong output; it is that MatchDto lacks the value semantics that you are expecting but did not implement. The quickest way to get value semantics is to turn MatchDto into a record.
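
The difference can be reproduced without a database. The following is a minimal sketch, assuming C# 9 or later (records are not available before that). The `Game` class, the sample data, and the names `MatchDtoClass`, `MatchDtoRecord`, and `GroupingDemo` are hypothetical stand-ins invented for illustration; only the two properties used in the question's query are modelled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the Games table.
public class Game
{
    public int MatchdayId { get; set; }
    public DateTime KickOffUtc { get; set; }
}

// Same shape as the question's MatchDto: a plain class without an
// Equals/GetHashCode override, so instances are compared by reference.
public class MatchDtoClass
{
    public int MatchdayId { get; set; }
    public DateTime KickOffDate { get; set; }
}

// A record gets compiler-generated, value-based Equals and GetHashCode.
public record MatchDtoRecord
{
    public int MatchdayId { get; init; }
    public DateTime KickOffDate { get; init; }
}

public static class GroupingDemo
{
    // Invented sample data: 2 matchdays with 5 games each. Games of the
    // same matchday share a date and differ only in the kick-off hour.
    public static List<Game> SampleGames() =>
        Enumerable.Range(0, 10)
            .Select(i => new Game
            {
                MatchdayId = i / 5 + 1,
                KickOffUtc = new DateTime(2024, 1, 1 + i / 5, 12 + i % 5, 0, 0, DateTimeKind.Utc)
            })
            .ToList();

    // Class key: every key is a distinct object, so every game forms its own group.
    public static int GroupCountWithClass(List<Game> games) =>
        games.GroupBy(g => new MatchDtoClass { MatchdayId = g.MatchdayId, KickOffDate = g.KickOffUtc.Date })
             .Count();

    // Record key: keys with equal property values are equal, so games are grouped by value.
    public static int GroupCountWithRecord(List<Game> games) =>
        games.GroupBy(g => new MatchDtoRecord { MatchdayId = g.MatchdayId, KickOffDate = g.KickOffUtc.Date })
             .Count();

    public static void Main()
    {
        var games = SampleGames();
        Console.WriteLine(GroupCountWithClass(games));  // prints 10
        Console.WriteLine(GroupCountWithRecord(games)); // prints 2
    }
}
```

If a record is not an option (for example on an older language version), overriding Equals and GetHashCode on the class by hand should give the same grouping behaviour, at the cost of more boilerplate.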