Search code examples
javasortingrandomcomparator

How can I sort using a comparator with ties broken randomly?


I'm building a program that simulates a sports league, and, oftentimes, I want to sort teams by points, goal differential, etc. to determine things like which team gets promoted or which team gets preferential seeding in a tournament. But, sometimes, teams will be tied on all the criteria that I want to use, and I would like those ties to be broken randomly. I was previously using teams' ids (which are unique ints) to break ties, but I'd like to make it so that no team gets consistent preferential treatment in this sorting. I know that if I just let Java decide which of the tied teams will be placed in front of the other, it'll be very unpredictable, but I don't trust it to be actually random, and I'm worried that it'd inadvertently give some teams an advantage.

I've been using Collections.sort() with comparators to do my sorting, and I recently just went for it, and switched my comparators to use a random number as the final tiebreaker rather than the team ids. Here is the comparator in question:

public static final Comparator<Team> pointsGDRand = new Comparator<Team>() {
    public int compare(Team a, Team b) {
        int result = b.getStats().getPts() - a.getStats().getPts();
        if (result == 0) {
            result = b.getStats().getGD() - a.getStats().getGD();
            if (result == 0) {
                result = R.boolNum();
            }
        }
        return result;
    }
};

I think the getter functions are self explanatory, but here's that random function:

public static int boolNum() {
    return r.nextBoolean() ? 1 : -1;
}

But, just a bit ago (it look a long time for this error to pop up), I got this error:

Exception in thread "main" java.lang.IllegalArgumentException: Comparison method violates its general contract!

I've read a little bit, and I think that the problem is likely that I had some instance where it figured out that A > B and B > C, but also A < C. This sheds some light on why it took so long for this error to pop up (it did maybe 20,000 sorts before it popped up with this error), because 2 teams getting the exact same score on all criteria seems like a pretty rare occurrence to me, much less 3 or more.

I'd rather not write my own custom sort, and I'd also rather not shuffle the list in question before each time I sort it (I feel like that'd be inefficient, but I'm confident it would work).

So, how can I make a comparator that breaks ties randomly (such that there's no risk of a team winning tiebreaks disproportionately) without risking that same error again?


Solution

  • We need to compare tied teams using an extra attribute. This attribute needs to be constant for the life of a Comparator, but random from one Comparator to the next.

    Create a new comparator each time you need to sort, and make sure that it behaves in a predictable, but random way:

    class PointsGDRandComparator implements Comparator<Team> {
        // each comparator has a different seed
        private final long seed = new Random().nextLong();
    
        public int compare(Team a, Team b) {
            int result = b.getStats().getPts() - a.getStats().getPts();
            if (result == 0) {
                // every time we compare a particular team, the same random number will be produced
                long aValue = new Random(seed * a.getTeamId()).nextLong();
                long bValue = new Random(seed * b.getTeamId()).nextLong();
                return Long.compare(aValue, bValue);
            }
            return result;
        }
    }
    

    So each Comparator will have a different random seed, but each random decision will be the same, as we create new Random instances with a seed based on the Comparators seed and the ids of the two teams.

    We are still sorting based on a function of team id (because we need to have a symmetric and transitive relation ship between comparing teams).

    Two Random instances created using the same seed will produce the same sequence of random numbers. So when ever we compare the team with id 45 with another team, we'll create a Random instance with the seed seed * 45 (where seed is constant for the lifetime of the Comparator), so the first time we call nextLong() on any instance of Random created with that seed we will always get the same number.

    Here's a less tricksy way of doing the same thing. It assumes that Team has correct hashCode and equals implementations, so that we can use it as a key in a Map:

    class PointsGDRandComparator implements Comparator<Team> {
        private final Map<Team,Long> tiebreakers = new HashMap<>();
        
        public PointsGDRandComparator(List<Team> teams) {
            Random r = new Random();
            for (Team t : teams) {
                tiebreakers.put(t, r.nextLong());
            }
        }
    
        public int compare(Team a, Team b) {
            int result = b.getStats().getPts() - a.getStats().getPts();
            if (result == 0) {
                return Long.compare(tiebreakers.get(a), tiebreakers.get(b));
            }
            return result;
        }
    }
    

    The above solution is expensive, because it creates a Map the size of the teams list, and creates a random number for each team. Unless you have thousands of teams that won't matter, but we can make it cheaper by populating the Map lazily:

    class PointsGDRandComparator2 implements Comparator<Team> {
        private final Map<Team,Long> tiebreakers = new HashMap<>();
        private final Random r = new Random();
    
        public int compare(Team a, Team b) {
            int result = b.getStats().getPts() - a.getStats().getPts();
            if (result == 0) {
                return Long.compare(
                        tiebreakers.computeIfAbsent(a, t -> r.nextLong()), 
                        tiebreakers.computeIfAbsent(b, t -> r.nextLong())
                );
            }
            return result;
        }
    }