Search code examples
pythonarraysmongodbloopsload-balancing

Distribute list of tasks evenly among list of employees


I want to distribute a list of alerts/notifications evenly amongst a list of active employees to try ensure that everyone has a similar number of responsabilities. I currently recieve a list of the alerts to be assigned and the active users plus the number of activities each one has currently assigned. As of right now I have only managed to achieve the distribution in the same order as the for loop iterates, but I can't think of a way to evenly do this.

For example:

  • e1 has 1 alert
  • e2 has 1 alert
  • e3 has 3 alerts

I want to assign tasks to e1 and e2 until they reach 3 and then the loop can continue as normal until it runs out of alerts.

Employee is a dictionary containing {"id": "oid", "alerts": int} unassignedAlerts is just a list of ObjectIds

Code:

for alert in unassignedAlerts:
                for employee in activeEmployees:
                    if employee ["alerts"] > maxAlerts:
                        maxAlerts = employee["alerts"]
                    elif employee ["_id"] in activeUsers:
                        operations.append(
                            UpdateOne({"_id": alert}, {"$set": {"employeeResponsible": employee["_id"], "status": "assigned"}})
                        )
                        employee["alerts"] += 1

Any ideas are appreciated.


Solution

  • This is a classic load balancing problem. The brute force method is to repeatedly find the employee who currently has the minimum number of alerts:

    for alert in unassigned_alerts:
        assigned_employee = min(active_employees, key=lambda e: e["alerts"])
        assigned_employee["alerts"] += 1
        ...
    

    This runs in O(m*n) time, where m is the number of unassigned alerts and n is the number of active employees.

    You can improve this to O(m*log(n)) by using a min-heap data structure to find the optimal employee. Python offers a built-in min-heap implementation in the heapq module. To use it, you could store the employees as a list of pairs (tuples of length 2), where the first element is the number of alerts and the second element is the dictionary containing other information:

    import heapq
    
    
    active_employees = [
        (3, {"name": "John", "id": 1}),
        (5, {"name": "Jane", "id": 2})
    ]
    heapq.heapify(active_employees)
    
    for alert in unassigned_alerts:
        num_alerts, assigned_employee = heapq.heappop(active_employees)
        heapq.heappush(active_employees, (num_alerts+1, assigned_employee))
        ...
    

    Finally, I should mention that you do not need to always assign to the employee with minimum load. You can achieve a very good balance (with high probability) by simply picking two employees at random and then assigning to the one with less work. See this excellent blog post for more details.