Search code examples
algorithmcombinatoricspseudocodemultiset

Generating the powerset of a multiset


Suppose I have a multiset

{a,a,a,b,c}

from which I can make the following selections:

{}
{a}
{a,a}
{a,a,a}
{a,a,a,b}
{a,a,a,b,c}
{a,a,a,c}
{a,a,b}
{a,a,b,c}
{a,a,c}
{a,b}
{a,b,c}
{a,c}
{b}
{b,c}
{c}

Notice that the number of selections equals 16. The cardinality of a powerset of a multiset, card(P(M)), is defined on OEIS as

card(P(M)) = prod(mult(x) + 1) for all x in M

where mult(x) is the multiplicity of x in M and prod is the product of the terms. So for our example, this would amount to 4 x 2 x 2 = 16.

Let's say, for example, that the multiplicity of these elements is very high:

m(a) = 21
m(b) = 36
m(c) = 44

Then

card(P(M)) = 22 * 37 * 45 = 36630.

But if we were to treat all those elements as distinct - as a set - the cardinality of the powerset would be

card(P(S)) = 2^(21+36+44) = 2535301200456458802993406410752.

The "standard" solution for this problem suggests to just compute the powerset of the set where all of the elements are treated as distinct, and then prune the results to remove the duplicates. That's a solution with O(2^n) complexity.

Does a general algorithm for generating a powerset of a multiset with complexity on the order of card(P(M)) exist?


Solution

  • powerset recipe with itertools

    What you are asking is usually called the powerset and is available as an itertools recipe, as well as a function in the module more_itertools. See the documentation:

    multiset = ['a', 'a', 'a', 'b', 'c']
    
    #
    # USING ITERTOOLS
    #
    import itertools
    
    def powerset(iterable):
        "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
        s = list(iterable)
        return itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))
    
    print(list(powerset(multiset)))
    # [(), ('a',), ('a',), ('a',), ('b',), ('c',), ('a', 'a'), ('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'c'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'c'), ('a', 'b', 'c'), ('a', 'a', 'a', 'b'), ('a', 'a', 'a', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'a', 'b', 'c')]
    
    #
    # USING MORE_ITERTOOLS
    #
    import more_itertools
    
    print(list(more_itertools.powerset(multiset)))
    # [(), ('a',), ('a',), ('a',), ('b',), ('c',), ('a', 'a'), ('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'c'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'c'), ('a', 'b', 'c'), ('a', 'a', 'a', 'b'), ('a', 'a', 'a', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'a', 'b', 'c')]
    

    Powerset of a collections.Counter object

    In Python, multisets are usually represented with a collections.Counter rather than with a list. The class collections.Counter is a subclass of dict; it implements dictionaries that map elements to counts, as well as several useful methods such as building a Counter by counting occurrences in a sequence.

    Taking the powerset of a Counter is the topic of another question on stackoverflow:

    Although I am not aware of an already-implemented method doing this in standard modules, the answer to that question presents one solution using itertools:

    import collections
    import itertools
    
    multiset = collections.Counter(['a', 'a', 'a', 'b', 'c'])
    # Counter({'a': 3, 'b': 1, 'c': 1})
    
    def powerset(multiset):
        range_items = [[(x, z) for z in range(y + 1)] for x,y in multiset.items()]
        products = itertools.product(*range_items)
        return [{k: v for k, v in pairs if v > 0} for pairs in products]
    
    print(powerset(multiset))
    # [{}, {'c': 1}, {'b': 1}, {'b': 1, 'c': 1}, {'a': 1}, {'a': 1, 'c': 1}, {'a': 1, 'b': 1}, {'a': 1, 'b': 1, 'c': 1}, {'a': 2}, {'a': 2, 'c': 1}, {'a': 2, 'b': 1}, {'a': 2, 'b': 1, 'c': 1}, {'a': 3}, {'a': 3, 'c': 1}, {'a': 3, 'b': 1}, {'a': 3, 'b': 1, 'c': 1}]