Search code examples
pythonalgorithmcomplexity-theorytime-complexitynp

Subset sum algorithm a little faster than 2^(n/2) in worst time?


After analyzing the fastest subset sum algorithm which runs in 2^(n/2) time, I noticed a slight optimization that can be done. I'm not sure if it really counts as an optimization and if it does, I'm wondering if it can be improved by recursion.

Basically from the original algorithm: http://en.wikipedia.org/wiki/Subset_sum_problem (see part with title Exponential time algorithm)

  • it takes the list and splits it into two
  • then it generates the sorted power sets of both in 2^(n/2) time
  • then it does a linear search in both lists to see if 1 value in both lists sum to x using a clever trick

In my version with the optimization

  • it takes the list and removes the last element last
  • then it splits the list in two
  • then it generates the sorted power sets of both in 2^((n-1)/2) time
  • then it does a linear search in both lists to see if 1 value in both lists sum to x or x-last (at same time with same running time) using a clever trick

If it finds either, then I will know it worked. I tried using python time functions to test with lists of size 22, and my version is coming like twice as fast apparently.

After running the below code, it shows

0.050999879837   <- the original algorithm
0.0250000953674   <- my algorithm

My logic for the recursion part is, well if it works for a size n list in 2^((n-1)/1) time, can we not repeat this again and again?

Does any of this make sense, or am I totally wrong?

Thanks

I created this python code:

from math import log, ceil, floor
import helper  # my own code
from random import randint, uniform
import time

# gets a list of unique random floats
# s = how many random numbers
# l = smallest float can be
# h = biggest float can be
def getRandomList(s, l, h):    
    lst = []
    while len(lst) != s:
        r = uniform(l,h)
        if not r in lst:
            lst.append(r)
    return lst

# This just generates the two powerset sorted lists that the 2^(n/2) algorithm makes.
# This is just a lazy way of doing it, this running time is way worse, but since
# this can be done in 2^(n/2) time, I just pretend its that running time lol
def getSortedPowerSets(lst):
    n = len(lst)
    l1 = lst[:n/2]
    l2 = lst[n/2:]

    xs = range(2**(n/2))
    ys1 = helper.getNums(l1, xs)
    ys2 = helper.getNums(l2, xs)

    return ys1, ys2

# this just checks using the regular 2^(n/2) algorithm to see if two values
# sum to the specified value
def checkListRegular(lst, x):
    lst1, lst2 = getSortedPowerSets(lst)

    left = 0
    right = len(lst2)-1
    while left < len(lst1) and right >= 0:
        sum = lst1[left] + lst2[right]
        if sum < x:
            left += 1
        elif sum > x:
            right -= 1
        else:
            return True
    return False

# this is my improved version of the above version
def checkListSmaller(lst, x):
    last = lst.pop()
    x1, x2 = x, x - last
    return checkhelper(lst, x1, x2)

# this is the same as the function 'checkListRegular', but it checks 2 values
# at the same time
def checkhelper(lst, x1, x2):    
    lst1, lst2 = getSortedPowerSets(lst)    

    left = [0,0]
    right = [len(lst2)-1, len(lst2)-1]

    while 1:
        check = 0
        if left[0] < len(lst1) and right[0] >= 0:
            check += 1
            sum = lst1[left[0]] + lst2[right[0]]
            if sum < x1:
                left[0] += 1
            elif sum > x1:
                right[0] -= 1
            else:
                return True
        if left[1] < len(lst1) and right[1] >= 0:
            check += 1
            sum = lst1[left[1]] + lst2[right[1]]
            if sum < x2:
                left[1] += 1
            elif sum > x2:
                right[1] -= 1
            else:
                return True
        if check == 0:
            return False


n = 22
lst = getRandomList(n, 1, 3000)


startTime = time.time()
print checkListRegular(lst, -50)  # -50 so it does worst case scenario
startTime2 = time.time()
print checkListSmaller(lst, -50)  # -50 so it does worst case scenario
startTime3 = time.time()

print (startTime2 - startTime)
print (startTime3 - startTime2)

This is the helper library which I just use to generate the powerset list.

def dec_to_bin(x):
    return int(bin(x)[2:])

def getNums(lst, xs):
    sums = []
    n = len(lst)
    for i in xs:
        bin = str(dec_to_bin(i))
        bin = (n-len(bin))*"0" + bin
        chosen_items = getList(bin, lst)
        sums.append(sum(chosen_items))
    sums.sort()
    return sums

def getList(binary, lst):
    s = []
    for i in range(len(binary)):
        if binary[i]=="1":
            s.append(float(lst[i]))
    return s

Solution

  • then it generates the sorted power sets of both in 2^((n-1)/2) time

    OK, since now the list has one less lement. However, this is not a big deal its just a constant time improvement of 2^(1/2)...

    then it does a linear search in both lists to see if 1 value in both lists sum to x or x-last (at same time with same running time) using a clever trick

    ... and this improvement will go away because now you do twice as many operations to check for both x and x-last sums instead of only for x

    can we not repeat this again and again?

    No you can't, for the same reason why you couldn't split the original algorithm again and again. The trick only works for once because once you start looking for values in more than two lists you can't use the sorting trick anymore.