Search code examples
algorithmgroupingmathematical-optimization

Hackathon team assignment with scheduling constraints and optimization


I'm helping to create teams for a virtual hackathon with around a thousand global participants. At the time of registration, participants will be asked to choose three preferred hacking times from a list of six time-slots (in GMT). I want to form teams of 3 (or 4, where there is a remainder) with the following constraints:

  • Teams must have 3 (or 4) members
  • Each team member must share at least 2 preferred hacking times
  • Each team should ideally have a mix of departments and offices

My input data from the registration survey would look something like this:

| Name     | Department    | Office      | Time Slots  | 
| -------- | --------------| ------------| ------------|
| A        | Engineering   | Bangalore   | 1, 3, 6     |
| B        | Engineering   | SF          | 2, 4, 5     |
| C        | Sales         | Amsterdam   | 1, 6, 2     |
| D        | Engineering   | NYC         | 1, 6, 3     |
| E        | CX            | SF          | 5, 1, 3     |
| F        | Engineering   | SF          | 2, 5, 4     |
| G        | Engineering   | SF          | 1, 6, 3     |
| H        | Product       | Bangalore   | 2, 4, 5     |
| I        | Product       | SF          | 1, 4, 3     |

My desired output would be a csv file like:

| Team Name | Team Members   | Shared Time Slots |
| --------- | -------------- | ------------------|
| Team A    | A, C, G        | 1, 6              |     
| Team B    | B, F, H        | 2, 4              |
| Team C    | D, E, I        | 1, 3              |

Since I'm willing to trade finding the optimal solution for a solution that is simpler to implement, I was considering hill-climbing with random restart based on this post. My questions are:

  • What is the general class of problems that this belongs so I can do more informed research?
  • Is there a better solution than hill-climbing?
  • If hill-climbing is indeed the way to go, how do I represent the hard constraint of shared time-slots in a utility function?

Solution

  • Here's a relatively simple algorithm that should work for all inputs except the most perverse.

    1. There are 15 combinations of two timeslots {1, 2, 3, 4, 5, 6}. For each combination construct a bin of participants (so 15 bins). Loop over all participants and add them to each bin that they are compatible with (so each participant can be in multiple bins).

      For each bin also initialize an empty list containing the groups constructed from that bin.

    2. Select the bin with the shortest list of groups, and in case of tie that has the least participants in it (ignoring bins with < 3 participants in them). Using any greedy soft-constraint algorithm pick a 'diverse' (in terms of departments/offices) group of 3 participants from this bin.

      Add this group to the list of groups for the selected bin, and remove the selected participants from all bins.

      Repeat this step until all bins have 2 or less members left (or until a constructed group wouldn't be diverse enough, although this risks a higher failure chance).

    3. At this point you are left with only a few ungrouped participants. Add them one by one to any compatible size three group in some greedy way (e.g. simply choose the group that maximizes diversity score).