c# .net linq group-by aggregation

LINQ: group rows by shared column combinations into two lists


How can I use LINQ to group the rows of a list by the combinations they share, and project each group into a pair of lists (the pet categories and the owners)?

Example:

I have these classes.

public class PetCategoryOwner
{
    public string PetCategory { get; set; }
    public string Owner { get; set; }
}

public class PetCategoriesOwners
{
    public IEnumerable<string> PetCategories { get; set; }
    public IEnumerable<string> Owners { get; set; }
}

The example data.

Owner    Pet Category
Higa     Terry
Higa     Charlotte
Oliver   Terry
Oliver   Charlotte
Oliver   Chausie
Price    Chausie
Liam     Terry
Liam     Chartreux

var petCategoryOwner = new List<PetCategoryOwner>()
{
    new PetCategoryOwner { Owner = "Higa", PetCategory = "Terry"},
    new PetCategoryOwner { Owner = "Higa", PetCategory = "Charlotte"},
    new PetCategoryOwner { Owner = "Oliver", PetCategory = "Terry"},
    new PetCategoryOwner { Owner = "Oliver", PetCategory = "Charlotte"},
    new PetCategoryOwner { Owner = "Oliver", PetCategory = "Chausie"},
    new PetCategoryOwner { Owner = "Price", PetCategory = "Chausie"},
    new PetCategoryOwner { Owner = "Liam", PetCategory = "Terry"},
    new PetCategoryOwner { Owner = "Liam", PetCategory = "Chartreux"}
};

Expected values

Owner    Pet Category   Group
Higa     Terry          A
Higa     Charlotte      A
Oliver   Terry          A
Oliver   Charlotte      A
Oliver   Chausie        B
Price    Chausie        B
Liam     Terry          C
Liam     Chartreux      C

var petCategoriesOwners = new List<PetCategoriesOwners>()
{
    new PetCategoriesOwners()
    {
        PetCategories = new List<string>() { "Terry", "Charlotte" },
        Owners = new List<string>() { "Oliver", "Higa" }
    },
    new PetCategoriesOwners()
    {
        PetCategories = new List<string>() { "Chausie" },
        Owners = new List<string>() { "Oliver", "Price" }
    },
    new PetCategoriesOwners()
    {
        PetCategories = new List<string>() { "Chartreux", "Terry" },
        Owners = new List<string>() { "Liam" }
    }
};

Solution

  • Solving this takes two steps: group the rows by owner, then merge an owner into an existing group whenever that group's category set is a subset of the owner's categories. You can try the following LINQ query:

    public class PetCategoriesOwners
    {
        public List<string> PetCategories { get; set; }
        public List<string> Owners { get; set; }
    }
    
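    var petCategoriesOwners = petCategoryOwner
        // One entry per owner, with that owner's categories.
        .GroupBy(x => x.Owner)
        .Select(x => new
        {
            Owner = x.Key,
            Categories = x.Select(y => y.PetCategory)
        })
        // Smallest category sets first, so they can seed the groups.
        .OrderBy(x => x.Categories.Count())
        .Aggregate(new List<PetCategoriesOwners>(), (acc, current) =>
        {
            var currentCategories = current.Categories.ToList();
            // Deferred query: each existing group is tested against the
            // categories still unclaimed at the moment it is reached.
            var matches = acc.Where(group => group.PetCategories.All(x => currentCategories.Contains(x)));
    
            foreach(var match in matches)
            {
                // The owner joins the group; its categories are now claimed.
                match.Owners.Add(current.Owner);
                currentCategories = currentCategories.Except(match.PetCategories).ToList();
            }
    
            // Whatever is left becomes a new group for this owner alone.
            if (currentCategories.Any())
            {
                acc.Add(
                    new PetCategoriesOwners() { 
                        Owners = new List<string>() { current.Owner }, 
                        PetCategories = currentCategories 
                    });
            }
    
            return acc;
        });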
    

    So it's important to group by Owner and to process the groups in ascending order of category count. For each owner, the Aggregate callback checks whether any previously added group is fully contained in that owner's categories. If it is, the owner is added to that group and the group's categories are removed from the owner's remaining set. If any categories are left over, they become a new group for that owner.
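
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea applied to the sample data from the question. The `PetGrouping` class and its `Group` and `SampleRows` methods are hypothetical names introduced only for this illustration, and the explicit `foreach` over the accumulator is assumed to behave like the deferred `Where` in the query, since both test each existing group against the categories that are still unclaimed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PetCategoryOwner
{
    public string PetCategory { get; set; }
    public string Owner { get; set; }
}

public class PetCategoriesOwners
{
    public List<string> PetCategories { get; set; }
    public List<string> Owners { get; set; }
}

// Hypothetical wrapper class, not part of the original answer.
public static class PetGrouping
{
    public static List<PetCategoriesOwners> Group(IEnumerable<PetCategoryOwner> rows)
    {
        return rows
            .GroupBy(r => r.Owner)
            .Select(g => new { Owner = g.Key, Categories = g.Select(r => r.PetCategory).ToList() })
            .OrderBy(o => o.Categories.Count)   // smallest category sets first
            .Aggregate(new List<PetCategoriesOwners>(), (acc, current) =>
            {
                var remaining = current.Categories.ToList();
                foreach (var existing in acc)
                {
                    // Join an existing group only if all of its categories are still unclaimed.
                    if (existing.PetCategories.All(c => remaining.Contains(c)))
                    {
                        existing.Owners.Add(current.Owner);
                        remaining = remaining.Except(existing.PetCategories).ToList();
                    }
                }
                // Leftover categories form a new group owned by this owner alone.
                if (remaining.Any())
                {
                    acc.Add(new PetCategoriesOwners
                    {
                        Owners = new List<string> { current.Owner },
                        PetCategories = remaining
                    });
                }
                return acc;
            });
    }

    // Sample data copied from the question.
    public static List<PetCategoryOwner> SampleRows()
    {
        return new List<PetCategoryOwner>
        {
            new PetCategoryOwner { Owner = "Higa", PetCategory = "Terry" },
            new PetCategoryOwner { Owner = "Higa", PetCategory = "Charlotte" },
            new PetCategoryOwner { Owner = "Oliver", PetCategory = "Terry" },
            new PetCategoryOwner { Owner = "Oliver", PetCategory = "Charlotte" },
            new PetCategoryOwner { Owner = "Oliver", PetCategory = "Chausie" },
            new PetCategoryOwner { Owner = "Price", PetCategory = "Chausie" },
            new PetCategoryOwner { Owner = "Liam", PetCategory = "Terry" },
            new PetCategoryOwner { Owner = "Liam", PetCategory = "Chartreux" }
        };
    }

    public static void Main()
    {
        foreach (var g in Group(SampleRows()))
        {
            Console.WriteLine($"[{string.Join(", ", g.PetCategories)}] -> [{string.Join(", ", g.Owners)}]");
        }
        // Prints:
        // [Chausie] -> [Price, Oliver]
        // [Terry, Charlotte] -> [Higa, Oliver]
        // [Terry, Chartreux] -> [Liam]
    }
}
```

The groups come out in a different order than in the question's expected list, because owners with the fewest categories are processed first. The contents of the three groups are the same.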

    Edit: .NET Fiddle