python linear-programming pyomo gekko constraint-programming

How to find the best possible team lineup (in swimming)

I currently have a fairly simple algorithm that tries to build the best possible team-lineup given some constrains:

There is a finite list of events which needs to be filled with swimmers

Event 1: 50 Free
Event 2: 100 Free
...

In each event, a team can send up to N (MaxSwimmersPerTeam) swimmers.
A single swimmer can participate in multiple events, but is limited by M (MaxEntriesPerSwimmer).
I have a list of swimmer best times for every event they have participated. (Note: Not every swimmer swims every possible event, it's a subset with size close to M).

team_1_best_times = [
    {"time_id": 1, "swimmer_id": 1, "event_id": 1, "time": 22.00, "rating": 900.00},
    {"time_id": 2, "swimmer_id": 1, "event_id": 2, "time": 44.00, "rating": 800.00},
    {"time_id": 3, "swimmer_id": 2, "event_id": 1, "time": 22.10, "rating": 890.00},
    {"time_id": 4, "swimmer_id": 2, "event_id": 2, "time": 46.00, "rating": 750.00},
]

The rating key (Higher is Better) gives me the ability to compare times across different events.

The best-possible line-up would be the lineup with max average rating across all chosen times satisfying N and M

My current approach is to iterate over all the times sorted by rating and fill the line-up until N and M are satisfied. For example given N=1 and M=1 my algorithm would:

Put Time1 (of Swimmer1) with 900 rating into Event1
Skip Time3 (Event1-Swimmer2) with 890 rating - since we already have 1 = N other swimmer (Swimmer1) in Event1, thus MaxSwimmersPerTeam has been reached.
Skip Time2 (Event2-Swimmer1) with 800 rating - since Swimmer1 has already been put in 1 = M other Event (Event1), thus (MaxEntriesPerSwimmer) has been reached.
Put Time4 (Swimmer2) with 750 rating into Event2.

Now the average rating of this team-lineup Event1-Swimmer1 (900) and Event2-Swimmer2 (750) would be: (900+750)/2 = 825.

And this is the simplest possible example showing where my approach falls short. What a smart coach could do, would be to put Swimmer1 into Event2 (800) and Swimmer2 into Event1 (890) reaching a higher avg rating of: (890+800)/2 = 845.

I've tried to research the problem a little bit, found libraries like python-constraint , gekko, pyomo but I still cannot figure out how to explain the problem using their tools.

Solution

Swimmer allocation is related to schedule optimization or set covering algorithms. Heuristic methods can often get close to the optimal solution with a simple sort & select method. Optimization methods are typically slower, but can find a more optimal solution. Below is a simple Mixed Integer Linear Program (MINLP) that solves a simplified version of your swimmer allocation problem in gekko.

from gekko import GEKKO

# Initialize the model
model = GEKKO(remote=False)

# Example mock data (replace with actual data)
swimmers = ['Swimmer1', 'Swimmer2', 'Swimmer3', 'Swimmer4', 'Swimmer5']
events = ['100Y Free', '200Y IM', '200Y Free']
# for each swimmer-event combination
rating = {('Swimmer1', '100Y Free'): 800,
         ('Swimmer1', '200Y IM'):    805,
         ('Swimmer1', '200Y Free'):  750,
         ('Swimmer2', '100Y Free'):  600,
         ('Swimmer2', '200Y IM'):    650,
         ('Swimmer2', '200Y Free'):  620,
         ('Swimmer3', '100Y Free'):  815,
         ('Swimmer3', '200Y IM'):    675,
         ('Swimmer3', '200Y Free'):  300,
         ('Swimmer4', '100Y Free'):  705,
         ('Swimmer4', '200Y IM'):    820,
         ('Swimmer4', '200Y Free'):  802,
         ('Swimmer5', '100Y Free'):  713,
         ('Swimmer5', '200Y IM'):    715,
         ('Swimmer5', '200Y Free'):  350,
         }

# Variables
assignments = {}
for swimmer in swimmers:
    for event in events:
        assignments[(swimmer, event)] = model.Var(lb=0, ub=1, integer=True)

# Objective Function: Maximize rating for each event
for event in events:
    total_rating = sum([rating[(swimmer,event)]*assignments[(swimmer, event)] for swimmer in swimmers])
    model.Maximize(total_rating)

# Constraint: min 1 event, max 3 events per swimmer
for swimmer in swimmers:
    model.Equation(model.sum([assignments[(swimmer, event)] for event in events]) <= 3)
    model.Equation(model.sum([assignments[(swimmer, event)] for event in events]) >= 1)

# Constraint: 3 swimmers per individual event
for event in events:
    model.Equation(model.sum([assignments[(swimmer, event)] for swimmer in swimmers]) == 3)

# Constraint 4 for each relay
# Add constraints for relay events

# Solve the model
model.options.SOLVER = 1
model.solve(disp=True)

# Output the assignments
for swimmer in swimmers:
    for event in events:
        if assignments[(swimmer, event)].value[0] >= 0.1:
            print(f"{swimmer} participates in {event}")

The solution is:

Swimmer1 participates in 100Y Free
Swimmer1 participates in 200Y IM
Swimmer1 participates in 200Y Free
Swimmer2 participates in 200Y Free
Swimmer3 participates in 100Y Free
Swimmer4 participates in 200Y IM
Swimmer4 participates in 200Y Free
Swimmer5 participates in 100Y Free
Swimmer5 participates in 200Y IM

I recommend a first look at heuristic methods as a more flexible / configurable solution. Heuristic methods need lower computing resources to get a quick solution versus a more optimal solution. If you do want to use MILP, a heuristic method is a way to initialize the solver with a good initial guess.