python optimization linear-programming pulp

Minimise set size in task assignment problem

I have to create a solution for assigning tasks to users according to some rules and I wanted to give linear programming a try.

I have a list of tasks that require a certain skill and belong to a specific team, and I have a list of available users, their assigned team for the day, and their skill set:

# Creating dummies
import random
import string

import pandas as pd

task = pd.DataFrame({
    'id': list(range(25)),
    'skill': [random.randint(0, 3) for _ in range(25)]
})
# Tasks requiring skill 1 or 2 belong to team A, all others to team B
task['team'] = task.apply(lambda row: 'A' if row.skill in (1, 2) else 'B', axis=1)

user_list = pd.DataFrame({
    'user': [''.join(random.sample(string.ascii_lowercase, 4)) for _ in range(10)],
    'team': [random.choice(['A', 'B']) for _ in range(10)]
})

# Each user gets three distinct skills drawn from 0..4
user_skill = {user_list['user'][k]: random.sample(range(5), 3) for k in range(len(user_list))}

The constraints I have to implement are the following:

All tasks must be assigned
A task can be assigned to only one user
A user cannot do a task for which he or she isn't skilled
A user cannot do a task that belongs to a team other than his or her own
The number of tasks per user should be as low as possible inside a team
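
The first four rules can also be read as a feasibility check on any candidate assignment, which is handy for validating whatever a solver returns. A minimal sketch in plain Python follows; the toy data and the helper name violations are hypothetical and not part of the question's code:

```python
# Hypothetical toy data, not the random frames from the question.
tasks = {0: {"skill": 1, "team": "A"},
         1: {"skill": 2, "team": "A"},
         2: {"skill": 0, "team": "B"}}
users = {"ann": {"team": "A", "skills": {1, 2}},
         "bob": {"team": "B", "skills": {0, 3}}}


def violations(assignment, tasks, users):
    """Return a list of rule violations for a list of (user, task_id) pairs."""
    problems = []
    owners = {}
    for user, task_id in assignment:
        owners.setdefault(task_id, []).append(user)
        task = tasks[task_id]
        # Rule: the user must have the skill the task requires
        if task["skill"] not in users[user]["skills"]:
            problems.append(f"{user} lacks skill {task['skill']} for task {task_id}")
        # Rule: the user must be on the task's team
        if task["team"] != users[user]["team"]:
            problems.append(f"{user} is not on team {task['team']} (task {task_id})")
    for task_id in tasks:
        n = len(owners.get(task_id, []))
        # Rules: every task is assigned, and to exactly one user
        if n == 0:
            problems.append(f"task {task_id} is unassigned")
        elif n > 1:
            problems.append(f"task {task_id} has {n} users")
    return problems


good = [("ann", 0), ("ann", 1), ("bob", 2)]
bad = [("ann", 0), ("bob", 1)]
print(violations(good, tasks, users))
print(violations(bad, tasks, users))
```

The fifth rule is not a yes/no check but an optimisation goal, which is why it needs separate treatment in the model.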

I struggled a lot to write this in PuLP but thanks to this post I managed to get some results.

import pulp

# Create the problem
task_assignment = pulp.LpProblem('task_assignment', pulp.LpMaximize)

# Create model vars
pair = pulp.LpVariable.dicts("Pair", (user_list.user, task.id), cat=pulp.LpBinary)
task_covered = pulp.LpVariable.dicts('Covered', task.id, cat=pulp.LpBinary)

# Set objective
task_assignment += pulp.lpSum(task_covered[t] for t in task.id) + \
        0.05 * pulp.lpSum(pair[u][t] for u in user_list.user for t in task.id)

# Constraints

# A task can only be done by one user
for t in task.id:
    task_assignment += pulp.lpSum(pair[u][t] for u in user_list.user) <= 1

# A user must be skilled for the task
for u in user_list.user:
    for t in task.id:
        if task[task.id == t].skill.values[0] not in user_skill[u]:
            task_assignment += pair[u][t] == 0

# A user cannot do a task for another team
for u in user_list.user:
    for t in task.id:
        if task[task.id == t].team.values[0] != user_list[user_list.user == u].team.values[0]:
            task_assignment += pair[u][t] == 0

task_assignment.solve()

My problem is that I have no idea how to implement the last constraint (i.e. the number of tasks per user should be as low as possible inside a team).

Does anyone have an idea how to do this?

Solution

First of all, your dummy data set isn't valid Python as originally posted: the conditional expression on the task['team'] line has to use if rather than for.

One way to minimize the number of tasks per user inside a team is to minimize the maximum number of tasks assigned to any single user of that team. To this end, we introduce a non-negative variable eps for each team, bound every team member's task count by it, and penalize it in the objective:

teams = user_list.team.unique()

# Create the problem
task_assignment = pulp.LpProblem('task_assignment', pulp.LpMaximize)

# Create model vars
pair = pulp.LpVariable.dicts("Pair", (user_list.user, task.id), cat=pulp.LpBinary)
task_covered = pulp.LpVariable.dicts('Covered', task.id, cat=pulp.LpBinary)
eps = pulp.LpVariable.dicts("eps", teams, cat=pulp.LpContinuous, lowBound=0.0)

# Set objective
task_assignment += pulp.lpSum(task_covered[t] for t in task.id) + \
    0.05 * pulp.lpSum(pair[u][t] for u in user_list.user for t in task.id) - \
    0.01 * pulp.lpSum(eps[team] for team in teams)

# Constraint
# ... your other constraints here ...

# The number of tasks per user should be as low as possible inside a team
for team in teams:
    # for all users in the current team
    for u in user_list[user_list["team"] == team].user:
        task_assignment += pulp.lpSum(pair[u][t] for t in task.id) <= eps[team]
    

task_assignment.solve()
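
Written out, the model in the snippet above corresponds roughly to the following program. The symbols are my own shorthand rather than notation from the post: x_{u,t} stands for pair[u][t], c_t for task_covered[t], \varepsilon_g for eps[g], and U_g for the set of users in team g. The elided constraints from the question are summarised in the third and fourth lines.

```latex
\begin{aligned}
\max \quad & \sum_{t} c_t \;+\; 0.05 \sum_{u}\sum_{t} x_{u,t} \;-\; 0.01 \sum_{g} \varepsilon_g \\
\text{s.t.} \quad & \sum_{t} x_{u,t} \le \varepsilon_g && \forall g,\ \forall u \in U_g \\
& \sum_{u} x_{u,t} \le 1 && \forall t \\
& x_{u,t} = 0 && \text{if } u \text{ lacks the skill for } t \text{ or is on another team} \\
& x_{u,t},\, c_t \in \{0,1\}, \quad \varepsilon_g \ge 0
\end{aligned}
```

Since each \varepsilon_g only appears with a negative coefficient in the objective, an optimal solution sets it to the largest per-user load in team g, so it can stay continuous even though the loads are integers.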

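Because this is a maximization problem, the sum of the eps variables has to be subtracted from the objective so that the solver drives each eps down to the largest per-user load in its team. The penalty weight of 0.01 is smaller than the 0.05 gained per assigned pair, so the solver never drops an assignment just to reduce eps.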
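
To see the effect on a small scale, here is a brute-force sketch in plain Python (no PuLP) that scores every way of giving four tasks to two users of a single team, using the same weights as above. The data and the helper name score are hypothetical. It assumes that both users are skilled for every task, that a task counts as covered only when someone owns it, and that eps takes the largest per-user load, which is the value the solver would settle on:

```python
from itertools import product

users = ["u1", "u2"]          # hypothetical: one team, both users fully skilled
tasks = [0, 1, 2, 3]


def score(owner_by_task):
    """Objective value and loads for a tuple of owners (None = unassigned)."""
    loads = {u: 0 for u in users}
    for owner in owner_by_task:
        if owner is not None:
            loads[owner] += 1
    assigned = sum(loads.values())
    eps = max(loads.values())          # tightest value allowed by load <= eps
    # Each covered task counts 1, each pair adds 0.05, eps costs 0.01.
    return assigned + 0.05 * assigned - 0.01 * eps, loads


# Enumerate every assignment, including ones that leave tasks unassigned.
best_value, best_loads = max(
    (score(choice) for choice in product(users + [None], repeat=len(tasks))),
    key=lambda result: result[0],
)
print(best_loads, round(best_value, 2))
```

In this toy instance the best-scoring assignment covers all four tasks and splits them two and two, because a 3/1 split pays a larger eps penalty for the same coverage.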