Search code examples
c#linqmultiple-tableslinq-group

How to group more than two tables using Linq


I have three tables ("Ratings", "Comments" and "Users"). One rating can have multiple comments. One comment belongs to one user.

So far, I have this LINQ statement which is working fine.

from rating in Ratings
join comment in Comments
on rating.ID equals comment.RatingID
join user in Users
on comment.UserID equals user.ID
select new {rating, comment, user}
.Where(x => x.rating.LocationID == 3);

How can I group this?


Solution

  • It depends a bit on what you want to group by. But there a multiple solutions.

    Solution 1:

    Let's say you would like to group by rating, then you could do:

    var query1 = from rating in db.Ratings
                    join comment in db.Comments
                        on rating.ID equals comment.RatingID
                    join user in db.Users
                        on comment.UserID equals user.ID
                    group new { comment, user } by rating into g
                    select new { g.Key, l = g.ToList() };
    
    foreach (var row in query1)
    {
        // you get rows grouped by rating
        Debug.WriteLine(row.Key.ID); // rating.ID
    
        // and a list of comments/users per rating
        foreach (var g in row.l)
        {
            Debug.WriteLine(g.comment.ID);
            Debug.WriteLine(g.user.ID);
        }
    }
    

    This gives you a single line per rating. And the dynamic object g contains a List of comment/user pairs per rating.

    Solution 2:

    However, like @gertarnold mentioned, it's easier to just read in the object you want to group by. And then traverse it's properties. Like this:

    var query2 = db.Ratings;
    
    foreach (var rating in query2)
    {
        Debug.WriteLine(rating.name);
        foreach (var comment in rating.Comments)
        {
            Debug.WriteLine(comment.name);
            Debug.WriteLine(comment.User.name);
        }
    }
    

    This is a lot easier to comprehend. It has a performance downside tough, because it performs a separate database SELECT-statement for each comment within the inner loop. If a rating has many many comments, then this is very slow. The first example with the grouping pulls in everything in a single database SELECT statement, which is a lot faster.

    Solution 3:

    And there is a way to have the best of both solutions. Do it like in solution 2 and add some DataLoadOptions in front of it:

    DataLoadOptions options = new DataLoadOptions();
    options.LoadWith<Rating>(rating => rating.Comments);
    options.LoadWith<Comment>(comment => comment.User);
    db.LoadOptions = options;
    

    This will preload all ratings with all the necessary child objects in a single SELECT. It's fast and easy to read and comprehend.

    PS: Just a side note: Tables should be named in singular. In that case Rating, Comment and User instead of Ratings, Comments and Users.

    PS2: To get also ratings without comments in solution 1, you need to convert the joins into outer joins. Like this:

    var query1 = from rating in db.Ratings
            join comment in db.Comments
                on rating.ID equals comment.RatingID into j1
            from comment in j1.DefaultIfEmpty()
            join user in db.Users
                on comment.UserID equals user.ID into j2
            from user in j2.DefaultIfEmpty()                                
            group new { comment, user } by rating into g
            select new { g.Key, l = g.ToList() };
    

    see also: 101 LINQ Samples - Left outer join