Search code examples
pythonalgorithmset-cover

Minimum set cover


I would like to solve a minimum set cover problem of the following sort. All the lists contain only 1s and 0s.

I say that a list A covers a list B if you can make B from A by inserting exactly x symbols.

Consider all 2^n lists of 1s and 0s of length n and set x = n/3. I would to compute a minimal set of lists of length 2n/3 that covers them all.

Here is a naive approach I have started on. For every possible list of length 2n/3 I create a set of all lists I can create from it using this function (written by DSM).

from itertools import product, combinations

def all_fill(source, num):
    output_len = len(source) + num
    for where in combinations(range(output_len), len(source)):
        # start with every possibility
        poss = [[0,1]] * output_len
        # impose the source list
        for w, s in zip(where, source):
            poss[w] = [s]
        # yield every remaining possibility
        for tup in product(*poss):
            yield tup

I then create the set of sets as follows using n = 6 as an example.

n = 6
shortn = 2*n/3
x = n/3
coversets=set()
for seq in product([0,1], repeat = shortn):
    coversets.add(frozenset(all_fill(seq,x)))

I would like to find a minimal set of sets from coversets whose union is allset = set(product([0,1], repeat=n)).

In this case, set(all_fill([1,1,1,1],2)), set(all_fill([0,0,0,0],2)), set(all_fill([1,1,0,0],2)), set(all_fill([0,0,1,1],2)) will do.

My aim is to solve the problem for n = 12. I am happy to use external libraries if that will help and I expect the time to be exponential in n in the worst case.


Solution

  • I’ve written a small program to write an integer program to be solved by CPLEX or another MIP solver. Below it is a solution for n=12.


    from collections import defaultdict
    from itertools import product, combinations
    
    def all_fill(source, num):
        output_len = (len(source) + num)
        for where in combinations(range(output_len), len(source)):
            poss = ([[0, 1]] * output_len)
            for (w, s) in zip(where, source):
                poss[w] = [s]
            for tup in product(*poss):
                (yield tup)
    
    def variable_name(seq):
        return ('x' + ''.join((str(s) for s in seq)))
    n = 12
    shortn = ((2 * n) // 3)
    x = (n // 3)
    all_seqs = list(product([0, 1], repeat=shortn))
    hit_sets = defaultdict(set)
    for seq in all_seqs:
        for fill in all_fill(seq, x):
            hit_sets[fill].add(seq)
    print('Minimize')
    print(' + '.join((variable_name(seq) for seq in all_seqs)))
    print('Subject To')
    for (fill, seqs) in hit_sets.items():
        print(' + '.join((variable_name(seq) for seq in seqs)), '>=', 1)
    print('Binary')
    for seq in all_seqs:
        print(variable_name(seq))
    print('End')
    

    MIP - Integer optimal solution:  Objective =  1.0000000000e+01
    Solution time =    7.66 sec.  Iterations = 47411  Nodes = 337
    
    CPLEX> Incumbent solution
    Variable Name           Solution Value
    x00000000                     1.000000
    x00000111                     1.000000
    x00011110                     1.000000
    x00111011                     1.000000
    x10110001                     1.000000
    x11000100                     1.000000
    x11001110                     1.000000
    x11100001                     1.000000
    x11111000                     1.000000
    x11111111                     1.000000
    All other variables matching '*' are 0.
    CPLEX>