Search code examples
pythonpython-3.xalgorithmdata-structuresgraph-algorithm

Select n items from a set of subsets


I'm wondering if there exists an algorithm that can solve this problem:

Suppose you have a set with sets in it where each set may or may not have elements, for example, let the possible elements of the sets be 1,2, and 3, then we would have for example a set like {{1,2,3}{1}{1,2} ... } so, how do I select a number of sets such that I have n elements of each item, for example, let n=200 then I want 200 1s, 200 2s and 200 3s in this example.


Solution

  • before I dive into an answer there are some interrogatives.

    suppose we have a method f() that given a set of sets (as the one in your example), what does it need to find?

    If we plug n = 3 to f(3) it needs to find the sets such that we have n elements of each item ( 3 1s, 3 2s, and 3 3s) exactly? Let's suppose this scenario.

    Then the output of our method should be the subsets of the sets that satisfy the previous problem or do we only need to return the number of sets in our subsets? for example f(3) in a set of sets as follows

    [[1,2,3][1][1,2][1,2][1,2,3][3]... ]
    

    we can see that a solution subset is [[1,2,3][1,2][1,2,3][3]] We can return this subset or rather the number of sets in this subset which is 4. (This could be relevant for the space complexity of the algorithm).

    Now to the solution, a Naive approach could be of this form:

    1. loop through the set of sets.
    2. for each element in the individual set, increase an element counter (for example the set [1,2] will increase the 1 and 2 counter by one each.
      1. if one of the element count is greater than 'n', then we do not consider that set into our solution.
      2. else, consider the set into your solution set.
      3. Whatever the case, call your method again but don't consider the previous set.
      4. return if you find that each element has increased the counter equal to 'n'.

    a pseudo code will look something like this:

    method find_sets(n,sets, sol_sets, element_counters):
    
        for each set in sets:
            updated_counter = element_counter +  1 for each element in set
            sets = sets.pop(set)
    
            if for a element in updated_counter > n:
                return method find_sets(n, sets, sol_sets, element_counters)
    
            else if for any element in updated_counter < n:
                sol_sets.add(set)
                return method find_sets(n, sets, sol_sets, updated_counter)
    
            else if for every element in updated_counter = n:
                sol_sets.add(set)
                return sol_sets
    
        # If the solution could not be found
         return None
    

    Considering a set of N sets, the worst-case scenario is to loop for each element, and considering this is a recursive algorithm, we can expect a time complexity of n².

    Hope this could bring light to finding a better solution.