Search code examples
pythoncombinationsbrute-forceheuristicsapproximation

Find minimum number of elements in a list that covers specific values


A recruiter wants to form a team with different skills and he wants to pick the minimum number of persons which can cover all the required skills.

N represents number of persons and K is the number of distinct skills that need to be included. list spec_skill = [[1,3],[0,1,2],[0,2,4]] provides information about skills of each person. e.g. person 0 has skills 1 and 3, person 1 has skills 0, 1 and 2 and so on.

The code should outputs the size of the smallest team that recruiter could find (the minimum number of persons) and values indicating the specific IDs of the people to recruit onto the team.

I implemented the code with brute force as below but since some data are more than thousands, it seems I need to be solved with heuristic approaches. In this case it is possible to have approximate answer.

Any suggestion how to solve it with heuristic methods will be appreciated.

N,K = 3,5
spec_skill = [[1,3],[0,1,2],[0,2,4]]

A = list(range(K))
set_a = set(A)

solved = False
for L in range(0, len(spec_skill)+1):
    for subset in itertools.combinations(spec_skill, L):
        s = set(item for sublist in subset for item in sublist)
        if set_a.issubset(s):
            print(str(len(subset)) + '\n' + ' '.join([str(spec_skill.index(item)) for item in subset]))
            solved = True
            break
    if solved: break


Solution

  • Here is my way of doing this. There might be potential optimization possibilities in the code, but the base idea should be understandable.

    import random
    import time
    def man_power(lst, K, iterations=None, period=0):
        """
        Specify a fixed number of iterations
        or a period in seconds to limit the total computation time.
        """
    
        # mapping each sublist into a (sublist, original_index) tuple
        lst2 = [(lst[i], i) for i in range(len(lst))]
        mini_sample = [0]*(len(lst)+1)
        if period<0 or (period == 0 and iterations is None):
            raise AttributeError("You must specify iterations or a positive period")
        
                
        def shuffle_and_pick(lst, iterations):
            mini = [0]*len(lst)
            for _ in range(iterations):
                random.shuffle(lst2)
                skillset = set()
                chosen_ones = []
                idx = 0
                fullset = True
                # Breaks from the loop when all skillsets are found
                while len(skillset) < K:
                    # No need to go further, we didn't find a better combination
                    if len(chosen_ones) >= len(mini):
                        fullset = False
                        break
                    before = len(skillset)
                    skillset.update(lst2[idx][0])
                    after = len(skillset)
                    if after > before:
                        # We append with the orginal index of the sublist
                        chosen_ones.append(lst2[idx][1])
                    idx += 1
                if fullset:
                    mini = chosen_ones.copy()
            return mini
        
        # Estimates how many iterations we can do in the specified period
        if iterations is None:
            t0 = time.perf_counter()
            mini_sample = shuffle_and_pick(lst, 1)
            iterations = int(period / (time.perf_counter() - t0)) - 1
        
        mini_result = shuffle_and_pick(lst, iterations)
        if len(mini_sample)<len(mini_result):
            return mini_sample, len(mini_sample)
        else:
            return mini_result, len(mini_result)