Search code examples
pythonpython-3.xalgorithmpowerset

Powerset algorithm in Python: difference between + and append on lists


I’m working through the powerset problem in Python.

The powerset P(S) of a set S is the set of all subsets of S. For example, if S = {a, b, c} then P(s) = {{}, {a}, {b}, {c}, {a,b}, {a, c}, {b, c}, {a, b, c}}.

This solution works just fine:

def powerset(array):
    powerset = [[]]
    for num in array:
        for i in range(len(powerset)):
            curr_subset = powerset[i]
            powerset.append(curr_subset + [num])
    return powerset

However, this solution does not:

def powerset(array):
    powerset = [[]]
    for num in array:
        for i in range(len(powerset)):
            curr_subset = powerset[i]
            curr_subset.append(num)
            powerset.append(curr_subset)
    return powerset

It seems to overwrite every array in the powerset on each powerset.append operation. For an input of [1, 2, 3], I end up with a return value of:

[[1, 2, 2, 3, 3, 3, 3],
 [1, 2, 2, 3, 3, 3, 3],
 [1, 2, 2, 3, 3, 3, 3],
 [1, 2, 2, 3, 3, 3, 3],
 [1, 2, 2, 3, 3, 3, 3],
 [1, 2, 2, 3, 3, 3, 3],
 [1, 2, 2, 3, 3, 3, 3],
 [1, 2, 2, 3, 3, 3, 3]]

Any idea what I’m not fully understanding here?


Solution

  • The problem with your algorithm is that lists are mutable, and you are creating a list full of references to the same list. Any time you append to one of them, you are appending to all of them, because there is only one of them. They are all the same list, you just have more than one reference to it.

    Imagine having a list of 10 copies of somebody's phone number; if you call up the first phone number and ask them to put on a hat, then when you call the second phone number, the person on the other end will already be wearing a hat, because it is the same person. If you call each number and say "put on a hat" each time, you'll end up with a list of 10 phone numbers for one person wearing 10 hats, when you actually wanted phone numbers for 10 people wearing one hat each.

    The simplest way to design this kind of algorithm is to avoid mutation completely; use tuples instead of lists. This way, every time you add on another element to the tuple, you are creating a new tuple instead of changing the existing one.

    Note that this is quite similar to your first solution using curr_subset + [num]; the + operation creates a new list, unlike append which changes the state of an existing list.

    def powerset(array):
        # a list containing an empty tuple
        powerset = [()]
    
        for num in array:
            for i in range(len(powerset)):
                curr_subset = powerset[i]
                # append one more number to the tuple
                curr_subset += (num,)
                powerset.append(curr_subset)
    
        return powerset
    

    Example:

    >>> powerset([1, 2, 3])
    [(), (1,), (2,), (1, 2), (3,), (1, 3), (2, 3), (1, 2, 3)]