Search code examples
.netmathcombinationspermutationpowerset

.NET - Calculating combinations with additional rules and twists


UPDATE: @bphelpsjr answer provides what I am looking for. Unfortunately someone down-voted him and I do not have the rep to up-vote. I am marking his response as the answer.

This is extremely long winded but I wanted to provide as much detail as possible.

Essentially, I want to take a set of data and generate a list of lists based on rules (defined below). This is essentially a filtered version of a powerset.

I will then store these results for repeated use (similar to that of a rainbow table) to avoid constant calculation of the same N. I will then use variable substitution (e.g., A = 18, B = 30) before applying other logic (not described below, not necessary for my question).

Here are two input options I've experimented with in attempting to create a solution. You could also use numbers instead of letters.

Input Option #1

var completeList = new List<Item>
        {
            new Item('A', 'A'),
            new Item('A', 'B'),
            new Item('A', 'C'),
            new Item('A', 'D'),            
            new Item('B', 'B'),
            new Item('B', 'C'),
            new Item('B', 'D'),           
            new Item('C', 'C'),
            new Item('C', 'D'),            
            new Item('D', 'D')
        };

Input Option #2

List<Item> aList = new List<Item> 
{
        new Item('A', 'A'),
        new Item('A', 'B'),
        new Item('A', 'C'),
        new Item('A', 'D'),            
    };

    List<Item> bList = new List<Item> 
    {
        new Item('B', 'B'),
        new Item('B', 'C'),
        new Item('B', 'D'),           
    };

    List<Item> cList = new List<Item> 
    {
        new Item('C', 'C'),
        new Item('C', 'D'),            
    };

    List<Item> dList = new List<Item> 
    {
        new Item('D', 'D')
    };

Desired Output

AA BB CC DD
AA BB CD
AA BC    DD
AA BD CC
AB    CC DD 
AB    CD
AC BB    DD
AC BD
AD BB CC
AD BC

Rules

The first 3 are definitive rules while the 4th is more a desire.

  1. Solution must be able to handle N number of distinct letters and lists of items

  2. Every distinct letter must appear at least once in the list of items. Example:

    AA BB CC DD <-- Valid

    AA BB CC <-- invalid, does not contain D

  3. Letters may only repeat within a given item. Example:

    AA BB CC DD <-- valid

    AA BA CC DD <-- invalid, A is repeated in a different item

  4. The logic must contain as much "aggressive filtering" and short circuiting as possible in order to cut down on the number of iterations that it will perform. I had a working left-shift solution but it does not scale whatsoever due to the (my?) inability to incorporate the filtering and short circuiting. This basically resulted in iterating through the entire powerset.

    • Example: Once a letter is found that is already contained within a potential list's items, move on to the next potential combination because this one is invalid.

    • Example: Once a valid list of items has been found, start the next round.

The next two are potential examples based on the way I currently have the data set grouped by the first letter of each item. They may not be applicable depending on what type of solution you're creating.

  • Potential Example: If an item contains a letter that is in the next list's items, skip that entire list and move to the next list of items.

    AA BC DD <-- We can skip the C list because it is covered by BC

  • Potential Example: Once you have exhausted a list's potential candidates (e.g., the last list will only ever have 1 item), you shouldn't (if my thinking is correct) need that list again until the list above it + 1 has changed items.

    AA BB CC DD <-- after you find this, stop searching the list containing DD until you get to BC (list above DD + 1)

    AA BB CD

    AA BC DD <-- We need DD again

    1. No list of items should repeat itself, regardless of the order of items. Example:

    AA BB CC DD == BB AA DD CC so do not include BB AA DD CC

Assumptions I've made:

  • It will be easier to group the sets by their initial starting letter (see sample data below). If this is not the optimal approach, it is not an issue.

Below is the Item class I use to hold my data simply for convenience:

public class Item
{
    public char First { get; set; }
    public char Second { get; set; }

    public Item(char first, char second)
    {
        First = first;
        Second = second;

    public List<char> DistinctCharacters()
    {
        return First == Second ? new List<char> { First } : new List<char> { First,  Second };
    }
}

Solution

  • This should work (using numbers instead of letters):

        private static BlockingCollection<List<int[]>> GetCombinations(int toConsider)
        {
            var allResults = new BlockingCollection<List<int[]>>();
            var possibilities = Enumerable.Range(0, toConsider).ToList();
            Parallel.ForEach(possibilities, possibility =>
            {
                GetIteratively(new List<int[]> { new[] { 0, possibility } }, allResults, possibilities.RemoveAllClone(x => x == 0 || x == possibility));
            });
            return allResults;
        }
        public static void GetIteratively(List<int[]> result, BlockingCollection<List<int[]>> allResults, List<int> possibilities)
        {
            Stack<Tuple<List<int[]>, List<int>>> stack = new Stack<Tuple<List<int[]>, List<int>>>();
            stack.Push(new Tuple<List<int[]>,List<int>>(result, possibilities));
            while (stack.Count > 0)
            {
                var pop = stack.Pop();
                if (pop.Item2.Count > 0)
                    pop.Item2.ForEach(x => stack.Push(new Tuple<List<int[]>, List<int>>(new List<int[]>(result) { new int[] { pop.Item2[0], x } }, pop.Item2.RemoveAllClone(y => (y == pop.Item2[0] || y == x)))));
                else
                    allResults.Add(result);
            }   
        }
    

    And here is the LinqExtension for RemoveAllClone

        public static List<T> RemoveAllClone<T>(this IEnumerable<T> source, Predicate<T> match)
        {
            var clone = new List<T>(source);
            clone.RemoveAll(match);
            return clone;
        }