Search code examples
python-3.xrecursiontime-complexityspace-complexitypowerset

What are the time and space complexities of this powerset function?


Hello I have been practicing algorithms and data structures, and I solved https://leetcode.com/problems/subsets/ which is the powerset function but it appears my solution is too slow. Here is the code:

def subsets(array):
    set = [array]
    n = len(array)
    for i in range(n):
        tmp_subsets = subsets(array[:i] + array[i+1:])
        for subset in tmp_subsets:
            if subset not in set:
                set.append(subset)
    return set

I know there are other better/faster solutions but I am trying to understand the time and space complexities of this one.

I think time complexity here is O(n^n) since the recursion tree has n branches per level and n levels total is it correct ?

So it is worse than the optimal O(2^n) solution.

I am not sure of space complexity though, my instincts say O(2^n) but when I draw out the recursion tree it's a little confusing because the space I am using at the point where my recursion stack is the biggest is:

space_for_level(0)+space_for_level(1)+...+space_for_level(n)
n + (n-1) +...+ 1 = O(n^2)

But in the end when I return everything the size of set is O(2^n) so my space complexity would be O(2^n + n^2) = O(2^n).

Is my analysis correct for both time and space ? Any pointers to help me think in a clearer way are welcome.


Solution

  • Time complexity

    The time complexity can be summed up to the following recurrence relation:

    T(n) = n * T(n-1) + n * 2^(n-1)

    since there is a main loop running n times and each time the subsets of the remaining elements (n-1) are first generated (T(n-1)) and then looped over (2^(n-1)).

    note the relation assumes every other operation in the loop takes constant time, which is a reasonable approximation as its effects as n grows are minimal. For example this operation:

    if subset not in set
    

    does not take constant time (it takes linear time in the list length, also set.append(subset) is not constant time in general), but lets overlook it for now (you get the picture and how to analyse the code).

    The recurrence relation suggests a complexity at least exponential.

    Space complexity

    First of all, all subsets of n are generated, this means at least a complexity of O(2^n). However due to recursion and re-generation of subsets during each step the space complexity is more than that. Specificaly in each loop step one running copy of subsets of n-1 are generated and then augmented to the original subsets. We have:

    S(n) = S(n-1) + 2^n

    since each call to subset generates 1 intermediate running copy of the remaining subsets (i.e S(n-1)) plus it combines those into the original subsets which are 2^n.

    note we do not count the storage needed to store each item of the subsets set, which itself requires a storage of complexity around O(n), but consider this to be of constant storage of O(1) for simplicity (for example storing a subset as a word, in binary encoding, for small n <= 64) so the whole storage of the subsets (without counting auxiliary storage) would be simply of O(2^n) complexity (else, as noted in comment, the storage complexity to simply store all subsets of n is of O(n 2^n) order).

    The recurrence relation suggests a complexity at least exponential.

    Solution

    Since the master theorem cannot be applied for these recurrence relations (as noted), check methods for solving first-order recurrence relations with variable coefficients in order to solve above relations (space complexity is like an arithmetic progression, while time complexity has more complex form) and acquire exact form of complexity (although the solution might be very complex and it will still be exponential one way or the other)

    Better complexity

    Better time and space complexity can be achieved by exploiting the structure and properties of subsets (given a definite ordering, eg lexicographic).

    Specifically the intimate relationship an already generated subset has to its predecessor and successor. So the successor (the next subset in sequence) can be generated with only minimal changes made to its predecessor (e.g adding or removing only a single element).

    For example the code generates many duplicates which it then tests if they already exist, this can be avoided all-together. Plus the recursion is not needed, so the time and space complexity of recursion can be eliminated. Furthermore only constant storage is needed inside the main loop (given that next subset can be generated with only minimal, constant amount of changes made to previous subset) and so on. All these optimisations exploit in one way or the other the symmetries and properties of the subsets under a definite ordering.

    PS. You can also research similar questions on computer.science.stackexchange which is involved with algorithmic complexity more fully