Ruby: Divide array into 2 arrays with the closest possible average

Background: I'm working on a "matchmaking system" for a small multiplayer video game side project. Every player has a rank from 0-10, every team has 4 players. I'm trying to find a good way to balance out the teams so that the average rank of both of them is as close as possible and the match is as fair as possible.

My current, flawed approach looks like this:

def create_teams(players)
    teams = Hash.new{|hash, team| hash[team] = []}

    players.sort_by(&:rank).each_slice(2) do |slice|
        teams[:team1] << slice[0]
        teams[:team2] << slice[1]
    end

    teams
end

This works decently well if the ranks are already pretty similar but it's not a proper solution to this problem. For example, it fails in a situation like this:

require "ostruct"

class Array
    def avg
        sum.fdiv(size)
    end
end

dummy_players = [9, 5, 5, 3, 3, 3, 2, 0].map{|rank| OpenStruct.new(rank: rank)}

teams = create_teams(dummy_players)
teams.each do |team, players|
    ranks = players.map(&:rank)
    puts "#{team} - ranks: #{ranks.inspect}, avg: #{ranks.avg}"
end

This results in pretty unfair teams:

team1 - ranks: [0, 3, 3, 5], avg: 2.75
team2 - ranks: [2, 3, 5, 9], avg: 4.75

Instead, I'd like the teams in this situation to be like this:

team1 - ranks: [0, 3, 3, 9], avg: 3.75
team2 - ranks: [2, 3, 5, 5], avg: 3.75

Solution

If there are n players, where n is an even number, there are

C(n) = n!/((n/2)!(n/2)!)

ways to partition the n players into two teams of n/2 players, where n! equals n-facorial. This is often expressed as the number of ways to choosing n/2 items from a collection of n items.

To obtain the partition that has a mimimum absolute difference in total ranks (and hence, in mean ranks), one would have to enumerate all C(n) partitions. If n = 8, as in this example, C(8) = 70 (see, for example, this online calculator). If, however, n = 16, then C(16) = 12,870 and C(32) = 601,080,390. This gives you an idea of how small n must be in order perform a complete enumeration.

If n is too large to enumerate all combinations you must resort to using a heuristic, or a subjective rule for partitioning the array of ranks. Here are two possibilities:

assign the highest rank element ("rank 1") to team A, assign elements with ranks 2 and 3 to team B, assign elements with ranks 4 and 5 to team A, and so on.
assign elements with ranks 1 and n to team A, elements with ranks 2 and n-1 to team B, and so on.

The trouble with heuristics is evaluating their effectiveness. For this problem, for every heuristic you devise there is an array of ranks for which the heuristic's performance is abysmal. If you know the universe of possible arrays of ranks and have a way of drawing unbiased samples you can evaluate the heuristic statistically. That generally is not possible, however.

Here is how you could examine all partitions. Suppose:

ranks = [3, 3, 0, 2, 5, 9, 3, 5]

Then we may perform the following calculations.

indices = ranks.size.times.to_a
  #=> [0, 1, 2, 3, 4, 5, 6, 7] 
team_a = indices.combination(ranks.size/2).min_by do |combo|
   team_b = indices - combo
   (combo.sum { |i| ranks[i] } - team_b.sum { |i| ranks[i] }).abs
end
  #=> [0, 1, 2, 5]
team_b = indices - team_a
  #=> [3, 4, 6, 7]

See Array#combination and Enumerable#min_by.

We see that team A players have ranks:

arr = ranks.values_at(*team_a)
  #=> [3, 3, 0, 9]

and the sum of those ranks is:

arr.sum
  #=> 15

Similarly, for team B:

arr = ranks.values_at(*team_b)
  #=> [2, 5, 3, 5] 
arr.sum
  #=> 15

See Array#values_at.