c# linq lambda

Trying to figure out a C# LINQ query for a list of objects that can each be tagged with multiple categories: get each category together with its associated objects


List of entities and the categories each one is tagged with:
entity1 (cat1, cat2, cat3)
entity2 (cat2, cat4, cat5)
entity3 (cat1, cat3, cat5)
entity4 (cat2, cat3)
entity5 (cat1, cat4, cat5)
class Entity
{
    public string Name { get; set; }
    public List<Category> Categories { get; set; }
}

var listentities = somemethod(); // returns List<Entity>

expected output:

cat1: entity1, entity3, entity5
cat2: entity1, entity2, entity4
cat3: entity1, entity3, entity4

Note: I got the desired output using two foreach loops and a dictionary, but I want to do it using LINQ.


Solution

  • I defined an Entity class as follows (the primary constructor syntax requires C# 12 or later):

    public class Entity(string name, List<string> categories)
    {
        public string Name { get; } = name;
    
        public List<string> Categories { get; } = categories;
    }
    

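    If a category is a custom type rather than a string, it may need value equality (override Equals and GetHashCode, or implement IEquatable<T>) so that separate instances of the same category are treated as one dictionary key or one group.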
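
    To illustrate the equality point, here is a minimal sketch. The Category record and the PlainCategory class are hypothetical types invented for this example; they are not from the question. A record gets value equality generated by the compiler, while a plain class falls back to reference equality.

```csharp
using System;
using System.Linq;

// Two distinct instances with the same name, as might come from separate queries.
var a = new Category("cat1");
var b = new Category("cat1");

// Records compare by value, so both instances fall into a single group.
var groupCount = new[] { a, b }.GroupBy(c => c).Count();
Console.WriteLine(groupCount); // 1

// A plain class compares by reference, so each instance forms its own group.
var plainCount = new[] { new PlainCategory("cat1"), new PlainCategory("cat1") }
    .GroupBy(c => c)
    .Count();
Console.WriteLine(plainCount); // 2

// Hypothetical category types for illustration only.
public record Category(string Name);

public class PlainCategory(string name)
{
    public string Name { get; } = name;
}
```

    If the same category instances are shared between entities, reference equality is enough and no extra work is needed.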

    I defined a test method, SomeMethod, as

    static List<Entity> SomeMethod()
    {
        return new List<Entity>{
            new Entity("Ent 1", new List<string> { "1", "2", "3", }),
            new Entity("Ent 2", new List<string> { "2", "4", "5", }),
            new Entity("Ent 3", new List<string> { "1", "3", "5", }),
            new Entity("Ent 4", new List<string> { "2", "3", }),
            new Entity("Ent 5", new List<string> { "1", "4", "5", }),
        };
    }
    

    Double Loop - No Linq

    First, an approach that doesn't use Linq.

    var listEntities = SomeMethod();
    var dict = new Dictionary<string, IList<Entity>>();
    foreach (var entity in listEntities)
    {
        foreach (var category in entity.Categories)
        {
            if (!dict.TryGetValue(category, out var value))
            {
                value = new List<Entity>();
                dict.Add(category, value);
            }
    
            value.Add(entity);
        }
    }
    
    // Iterate the key/value pairs directly instead of looking each key up again.
    foreach (var (category, entities) in dict)
    {
        Console.WriteLine($"{category}: {string.Join(", ", entities.Select(e => e.Name))}");
    }
    

    The outer loop walks the list of Entity objects and the inner loop walks each entity's categories, adding the entity to that category's list in the Dictionary (the list is created the first time a category is seen).

    Linq

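    This time using Linq: SelectMany flattens the entities into (category, entity) pairs, and GroupBy then groups those pairs by category.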

    var listEntities = SomeMethod();
    var groupByCategory =
        listEntities.SelectMany(e => e.Categories.Select(c => new { Category = c, Entity = e, }))
            .GroupBy(ce => ce.Category, (cat, ents) => new { Category = cat, Entities = ents });
    
    foreach (var group in groupByCategory)
    {
        Console.WriteLine($"{group.Category}: {string.Join(", ", group.Entities.Select(e => e.Entity.Name))}");
    }
    

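    GroupBy returns a sequence of groups keyed by category, which is roughly equivalent to the dictionary. Supplying a resultSelector (the second lambda) 'cleans up' the result a bit by projecting each group into an object with named Category and Entities properties.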

    Both approaches produce the following output:

    1: Ent 1, Ent 3, Ent 5
    2: Ent 1, Ent 2, Ent 4
    3: Ent 1, Ent 3, Ent 4
    4: Ent 2, Ent 5
    5: Ent 2, Ent 3, Ent 5
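
    Since the GroupBy result is only roughly equivalent to the dictionary, a variation worth considering is ToLookup, which builds an indexable, dictionary-like structure in one step. The sketch below is not part of the approaches above. It reuses the same Entity shape and test data, and the byCategory name is made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listEntities = new List<Entity>
{
    new Entity("Ent 1", new List<string> { "1", "2", "3" }),
    new Entity("Ent 2", new List<string> { "2", "4", "5" }),
    new Entity("Ent 3", new List<string> { "1", "3", "5" }),
    new Entity("Ent 4", new List<string> { "2", "3" }),
    new Entity("Ent 5", new List<string> { "1", "4", "5" }),
};

// Flatten to (category, entity) pairs, then index the entity names by category.
ILookup<string, string> byCategory = listEntities
    .SelectMany(e => e.Categories, (e, c) => new { Category = c, Entity = e })
    .ToLookup(ce => ce.Category, ce => ce.Entity.Name);

foreach (var group in byCategory)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group)}");
}

// Same Entity shape as in the answer above.
public class Entity(string name, List<string> categories)
{
    public string Name { get; } = name;
    public List<string> Categories { get; } = categories;
}
```

    A lookup can be indexed like a dictionary (byCategory["2"]), and indexing a key that is not present returns an empty sequence rather than throwing. Unlike GroupBy, which is evaluated lazily, ToLookup runs the query immediately.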