
Combining elements from different lists that are all related to each other


Imagine you have an instance of List<List<int>> with several entries. For example:

List<List<int>> inputLists = new List<List<int>>
{
    new List<int> { 4, 2, 1 },
    new List<int> { 5, 9, 1 },
    new List<int> { 6, 1, 2 },
    new List<int> { 9, 4, 3 },
    new List<int> { 9, 4, 2 }
};

I now want to combine elements into bigger lists if every element is related to every other element, meaning that each pair of them appears together in at least one input list. Each result list should contain at least 4 elements, though. For the example above, the result would be a List<List<int>> with only one entry: { 4, 2, 1, 9 } (order is irrelevant). Why is that?

Try to form pairs:

  • 9 is in the same list as 1 (check second list)
  • 9 is in the same list as 2 (check last)
  • 9 is in the same list as 4 (check fourth / last)
  • 4, 2 and 1 are all in the same list as well (check first; trivial)
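
The pair checks above can be sketched in code. This is only an illustration of the "related" rule as described here, not a full solution. The class and method names (PairCheck, ShareAList, AllRelated) are hypothetical and do not appear in my code below.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, only meant to illustrate the pair checks above.
public static class PairCheck
{
    // True if at least one list contains both a and b.
    public static bool ShareAList(List<List<int>> lists, int a, int b) =>
        lists.Any(list => list.Contains(a) && list.Contains(b));

    // True if every two distinct positions in candidates hold numbers
    // that appear together in at least one list.
    public static bool AllRelated(List<List<int>> lists, IReadOnlyList<int> candidates)
    {
        for (int i = 0; i < candidates.Count; i++)
        {
            for (int j = i + 1; j < candidates.Count; j++)
            {
                if (!ShareAList(lists, candidates[i], candidates[j]))
                {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With the inputLists from above, PairCheck.AllRelated(inputLists, new[] { 4, 2, 1, 9 }) returns true. Adding 5 to the candidates makes it return false, because 5 and 4 never appear in the same list.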

Here is my code. I apologize if it looks confused; I have not had enough sleep lately.

internal void Test()
{
    List<List<int>> inputLists = new List<List<int>>
    {
        new List<int> { 4, 2, 1 },
        new List<int> { 5, 9, 1 },
        new List<int> { 6, 1, 2 },
        new List<int> { 9, 4, 3 },
        new List<int> { 9, 4, 2 }
    };

    foreach (var input in inputLists)
    {
        IEnumerable<IEnumerable<int>> results = Array.Empty<IEnumerable<int>>();
        input.ForEach(number =>
        {
            //first step: getting all numbers that are related to "number"
            //number is 4 | related ones are: 4, 2, 1, 9, 3
            IEnumerable<int> everyFrickingNumberAppearing = Array.Empty<int>();
            inputLists.Where(inputList => inputList.Contains(number)).ToList()
            .ForEach(inputList =>
            everyFrickingNumberAppearing =   everyFrickingNumberAppearing.Concat(inputList).Distinct()
            );


            var IReallyDontKnowWhatIAmDoing = everyFrickingNumberAppearing
            //trying out every number
            .Where(probablyRelatedNumber => inputLists
            //looking if there is at least one element that
            //appears in at least one list together with "number"
            .Any(list => list.Contains(probablyRelatedNumber) && list.Contains(number)))
            .ToArray();
            //appending all related numbers of "number" to results
            results.Append(IReallyDontKnowWhatIAmDoing);
        });
        //going through results
        results.ToList().ForEach(r =>
        {
            //selecting only nums which appear in every result
            var importantNums = r.Select(num => results.ToList().All(l => l.Contains(num)));
            //print it
            importantNums.ToList().ForEach(n => Console.Write($"{n} "));
            Console.WriteLine();
        });
    }
}

As mentioned above, I expected 4 2 1 9 (sorting not relevant) to be the output. Thank you for your help.


Solution

  • Let us create a HashSet<(int, int)> that stores each pair of numbers occurring in the same list as a value tuple. For the sake of simplicity, let's store each pair twice, as (x, y) and (y, x). This simplifies the further processing.

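    // _inputLists is assumed to hold the List<List<int>> from the question
    // (called inputLists there).
    var pairs = new HashSet<(int, int)>();
    foreach (var smallList in _inputLists) {
        foreach (int a in smallList) {
            foreach (int b in smallList) {
                // Skip pairing a number with itself.
                if (a != b) {
                    pairs.Add((a, b));
                }
            }
        }
    }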
    

    Since each pair appears twice, as (x, y) and (y, x), we can now simply group by one item of the tuple (Item1) and count how many times it occurs. This count is the number of distinct other numbers it shares a list with, and we keep only the numbers where it is at least 4.

    var result = pairs
        .Select(p => p.Item1)
        .GroupBy(x => x)
        .Where(g => g.Count() >= 4)
        .OrderBy(g => g.Key);
    foreach (var g in result) {
        Console.WriteLine(g.Key);
    }
    

    Prints:

    1
    2
    4
    9
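
    To see why exactly these four numbers pass the filter, it can help to look at the counts themselves. The following is a minimal sketch that repeats the two steps above and returns the count per number instead of printing the filtered keys. The class and method names (PartnerCounts, Count) are made up for this illustration, and the input is passed in as a parameter instead of being read from a field.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical helper, only meant to make the counts visible.
    public static class PartnerCounts
    {
        // Maps each number to how many distinct other numbers share a list with it.
        public static SortedDictionary<int, int> Count(List<List<int>> lists)
        {
            // Same pair set as above: every pair is stored as (a, b) and (b, a).
            var pairs = new HashSet<(int, int)>();
            foreach (var list in lists)
                foreach (int a in list)
                    foreach (int b in list)
                        if (a != b)
                            pairs.Add((a, b));

            // Group by the first item and count the distinct partners.
            var counts = new SortedDictionary<int, int>();
            foreach (var group in pairs.GroupBy(p => p.Item1))
                counts[group.Key] = group.Count();
            return counts;
        }
    }
    ```

    For the input from the question this gives 1 → 5, 2 → 4, 3 → 2, 4 → 4, 5 → 2, 6 → 2 and 9 → 5, so only 1, 2, 4 and 9 reach the threshold of 4. Note that the count only says how many partners a number has. It does not by itself show that the surviving numbers are all related to each other. For this input, the pair checks listed in the question confirm that they are.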