UPDATE: @bphelpsjr answer provides what I am looking for. Unfortunately someone down-voted him and I do not have the rep to up-vote. I am marking his response as the answer.
This is extremely long winded but I wanted to provide as much detail as possible.
Essentially, I want to take a set of data and generate a list of lists based on rules (defined below). This is essentially a filtered version of a powerset.
I will then store these results for repeated use (similar to that of a rainbow table) to avoid constant calculation of the same N
. I will then use variable substitution (e.g., A = 18, B = 30) before applying other logic (not described below, not necessary for my question).
Here are two input options I've experimented with in attempting to create a solution. You could also use numbers instead of letters.
var completeList = new List<Item>
{
new Item('A', 'A'),
new Item('A', 'B'),
new Item('A', 'C'),
new Item('A', 'D'),
new Item('B', 'B'),
new Item('B', 'C'),
new Item('B', 'D'),
new Item('C', 'C'),
new Item('C', 'D'),
new Item('D', 'D')
};
List<Item> aList = new List<Item>
{
new Item('A', 'A'),
new Item('A', 'B'),
new Item('A', 'C'),
new Item('A', 'D'),
};
List<Item> bList = new List<Item>
{
new Item('B', 'B'),
new Item('B', 'C'),
new Item('B', 'D'),
};
List<Item> cList = new List<Item>
{
new Item('C', 'C'),
new Item('C', 'D'),
};
List<Item> dList = new List<Item>
{
new Item('D', 'D')
};
AA BB CC DD
AA BB CD
AA BC DD
AA BD CC
AB CC DD
AB CD
AC BB DD
AC BD
AD BB CC
AD BC
Rules
The first 3 are definitive rules while the 4th is more a desire.
Solution must be able to handle N number of distinct letters and lists of items
Every distinct letter must appear at least once in the list of items. Example:
AA BB CC DD <-- Valid
AA BB CC <-- invalid, does not contain D
Letters may only repeat within a given item. Example:
AA BB CC DD <-- valid
AA BA CC DD <-- invalid, A is repeated in a different item
The logic must contain as much "aggressive filtering" and short circuiting as possible in order to cut down on the number of iterations that it will perform. I had a working left-shift solution but it does not scale whatsoever due to the (my?) inability to incorporate the filtering and short circuiting. This basically resulted in iterating through the entire powerset.
Example: Once a letter is found that is already contained within a potential list's items, move on to the next potential combination because this one is invalid.
Example: Once a valid list of items has been found, start the next round.
The next two are potential examples based on the way I currently have the data set grouped by the first letter of each item. They may not be applicable depending on what type of solution you're creating.
Potential Example: If an item contains a letter that is in the next list's items, skip that entire list and move to the next list of items.
AA BC DD <-- We can skip the C list because it is covered by BC
Potential Example: Once you have exhausted a list's potential candidates (e.g., the last list will only ever have 1 item), you shouldn't (if my thinking is correct) need that list again until the list above it + 1 has changed items.
AA BB CC DD <-- after you find this, stop searching the list containing DD until you get to BC (list above DD + 1)
AA BB CD
AA BC DD <-- We need DD again
- No list of items should repeat itself, regardless of the order of items. Example:
AA BB CC DD == BB AA DD CC so do not include BB AA DD CC
Assumptions I've made:
Below is the Item class I use to hold my data simply for convenience:
public class Item
{
public char First { get; set; }
public char Second { get; set; }
public Item(char first, char second)
{
First = first;
Second = second;
public List<char> DistinctCharacters()
{
return First == Second ? new List<char> { First } : new List<char> { First, Second };
}
}
This should work (using numbers instead of letters):
private static BlockingCollection<List<int[]>> GetCombinations(int toConsider)
{
var allResults = new BlockingCollection<List<int[]>>();
var possibilities = Enumerable.Range(0, toConsider).ToList();
Parallel.ForEach(possibilities, possibility =>
{
GetIteratively(new List<int[]> { new[] { 0, possibility } }, allResults, possibilities.RemoveAllClone(x => x == 0 || x == possibility));
});
return allResults;
}
public static void GetIteratively(List<int[]> result, BlockingCollection<List<int[]>> allResults, List<int> possibilities)
{
Stack<Tuple<List<int[]>, List<int>>> stack = new Stack<Tuple<List<int[]>, List<int>>>();
stack.Push(new Tuple<List<int[]>,List<int>>(result, possibilities));
while (stack.Count > 0)
{
var pop = stack.Pop();
if (pop.Item2.Count > 0)
pop.Item2.ForEach(x => stack.Push(new Tuple<List<int[]>, List<int>>(new List<int[]>(result) { new int[] { pop.Item2[0], x } }, pop.Item2.RemoveAllClone(y => (y == pop.Item2[0] || y == x)))));
else
allResults.Add(result);
}
}
And here is the LinqExtension for RemoveAllClone
public static List<T> RemoveAllClone<T>(this IEnumerable<T> source, Predicate<T> match)
{
var clone = new List<T>(source);
clone.RemoveAll(match);
return clone;
}