Search code examples
algorithmsetsumsubsetdata-partitioning

Partition a Set into k Disjoint Subset


Give a Set S, partition the set into k disjoint subsets such that the difference of their sums is minimal.

say, S = {1,2,3,4,5} and k = 2, so { {3,4}, {1,2,5} } since their sums {7,8} have minimal difference. For S = {1,2,3}, k = 2 it will be {{1,2},{3}} since difference in sum is 0.

The problem is similar to The Partition Problem from The Algorithm Design Manual. Except Steven Skiena discusses a method to solve it without rearrangement.

I was going to try Simulated Annealing. So i wondering, if there was a better method?

Thanks in advance.


Solution

  • The pseudo-polytime algorithm for a knapsack can be used for k=2. The best we can do is sum(S)/2. Run the knapsack algorithm

    for s in S:
        for i in 0 to sum(S):
            if arr[i] then arr[i+s] = true;
    

    then look at sum(S)/2, followed by sum(S)/2 +/- 1, etc.

    For 'k>=3' I believe this is NP-complete, like the 3-partition problem.

    The simplest way to do it for k>=3 is just to brute force it, here's one way, not sure if it's the fastest or cleanest.

    import copy
    arr = [1,2,3,4]
    
    def t(k,accum,index):
        print accum,k
        if index == len(arr):
            if(k==0):
                return copy.deepcopy(accum);
            else:
                return [];
    
        element = arr[index];
        result = []
    
        for set_i in range(len(accum)):
            if k>0:
                clone_new = copy.deepcopy(accum);
                clone_new[set_i].append([element]);
                result.extend( t(k-1,clone_new,index+1) );
    
            for elem_i in range(len(accum[set_i])):
                clone_new = copy.deepcopy(accum);
                clone_new[set_i][elem_i].append(element)
                result.extend( t(k,clone_new,index+1) );
    
        return result
    
    print t(3,[[]],0);
    

    Simulated annealing might be good, but since the 'neighbors' of a particular solution aren't really clear, a genetic algorithm might be better suited to this. You'd start out by randomly picking a group of subsets and 'mutate' by moving numbers between subsets.