Tags: python, list, ranking

Is there a good algorithm to reconstruct a global ranked list from several partially ranked lists?


I have a list of objects that are being reviewed by a team of judges. Each judge sees only a subset of the objects and ranks that subset from best to worst. Each object is ranked by at least two judges, and the judges may disagree (i.e., this is a noisy judging process). No judge sees all the objects.

Is there a good algorithm to compile the "best" global list of rankings given the collection of partial ranked lists from all the judges?

An example in Python (treat it as pseudocode):

# Let's say there are six things and we want to rank them.

# There are ... four judges, each of whom judges three things, 
# so each thing gets judged twice.

items = ['a', 'b', 'c', 'd', 'e', 'f']
j1_rank = ['a', 'c', 'e']
j2_rank = ['b', 'd', 'f']
j3_rank = ['a', 'b', 'c']
j4_rank = ['d', 'e', 'f']
# these are ranked low to high

# the goal is - can we combine together ranks j1-j4 to reproduce a master ranked list
expected_ranked_list = ['a', 'b', 'c', 'd', 'e', 'f']

I've taken a look at rank aggregation algorithms, but most of the online material I've found is either very technical and mathematical or buried in jargon-heavy scientific literature. Much of it is about ranked-choice voting (e.g. for political candidates), which is not the same problem as the one I'm facing.

Edit: As pointed out by @JoshGordon and @SurajShourie, I believe another acceptable expected solution would be ['a', 'b', 'd', 'c', 'e', 'f'].


Solution

  • I think I've come up with a (brute-force) solution to my toy example. I believe I was implicitly looking for a Kemeny-Young solver, i.e. one that minimizes the number of pairwise disagreements between the set of partial rankings and the solved global ranking.

    Here's my solution:

    j1_rank = ['a', 'c', 'e']
    j2_rank = ['b', 'd', 'f']
    j3_rank = ['a', 'b', 'c']
    j4_rank = ['d', 'e', 'f']
    
    # the goal is - can we combine together ranks j1-j4 
    # to reproduce a master ranked list
    expected_ranked_list = ['a', 'b', 'c', 'd', 'e', 'f']
    
    from itertools import combinations, permutations

    def Kemeny_metric(global_rank, partial_ranks):
        """Count pairwise disagreements between global_rank and the partial ranks."""
        score = 0
        # for each pair (gi, gj) with gi placed before gj in the global rank
        for gi, gj in combinations(global_rank, 2):
            # check that pair in each partial rank
            for pr in partial_ranks:
                # partial rankings that do not contain both items are skipped
                if gi in pr and gj in pr and pr.index(gi) > pr.index(gj):
                    # the pair is ordered the opposite way in this partial ranking
                    score += 1
        return score

    partial_ranks = [j1_rank, j2_rank, j3_rank, j4_rank]

    for i, perm in enumerate(permutations(expected_ranked_list)):
        score = Kemeny_metric(perm, partial_ranks)
        # print only the permutations that violate no partial ranking
        if score == 0:
            print(i, perm, score)
    

    This script yields:

    # >> 0 ('a', 'b', 'c', 'd', 'e', 'f') 0
    # >> 6 ('a', 'b', 'd', 'c', 'e', 'f') 0
    

    which matches my intuition that 'b' should always come before 'd', while 'c' and 'd' are tied because no judge compared them.
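The loop above prints only permutations with a score of exactly 0, and those exist only when the judges never contradict one another. As a minimal sketch of how the same brute force might be extended to noisy input, the version below keeps every permutation that reaches the lowest score instead. The names `count_disagreements` and `best_rankings`, and the fifth judge in the usage lines, are made up for illustration and are not part of the original script.

```python
from itertools import combinations, permutations


def count_disagreements(global_rank, partial_ranks):
    """Count item pairs that a partial ranking orders opposite to global_rank."""
    position = {item: idx for idx, item in enumerate(global_rank)}
    score = 0
    for partial in partial_ranks:
        # combinations() yields (x, y) with x listed before y in this partial ranking
        for x, y in combinations(partial, 2):
            if position[x] > position[y]:
                score += 1
    return score


def best_rankings(items, partial_ranks):
    """Brute force: return (minimum score, every permutation achieving it)."""
    best_score, best = None, []
    for perm in permutations(items):
        score = count_disagreements(perm, partial_ranks)
        if best_score is None or score < best_score:
            best_score, best = score, [perm]
        elif score == best_score:
            best.append(perm)
    return best_score, best


items = ['a', 'b', 'c', 'd', 'e', 'f']
judges = [['a', 'c', 'e'], ['b', 'd', 'f'], ['a', 'b', 'c'], ['d', 'e', 'f']]
print(best_rankings(items, judges))

# Hypothetical fifth judge who disagrees with judge 3 about 'a' and 'b'
noisy_judges = judges + [['b', 'a']]
print(best_rankings(items, noisy_judges))
```

This still enumerates all n! permutations, so it is only practical for a handful of items.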

    My next task is to find an algorithm that solves (or approximately solves) this problem for larger lists of items. I'm beginning my search here: Experiments with Kemeny ranking: What works when?
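For larger lists, one possible starting point is a cheap heuristic rather than an exact solver. The sketch below is an assumption of mine and is not taken from the paper mentioned above: it seeds an ordering by net pairwise wins (a Copeland-style score), then swaps adjacent items for as long as a swap reduces the number of disagreements. All function names are hypothetical, and the result is not guaranteed to be an optimal Kemeny ranking.

```python
from itertools import combinations


def pairwise_wins(partial_ranks):
    """wins[(x, y)] = number of judges who listed x before y."""
    wins = {}
    for partial in partial_ranks:
        for x, y in combinations(partial, 2):
            wins[(x, y)] = wins.get((x, y), 0) + 1
    return wins


def disagreements(order, wins):
    """Total number of judged pairs that `order` places the other way round."""
    pos = {item: i for i, item in enumerate(order)}
    return sum(n for (x, y), n in wins.items() if pos[x] > pos[y])


def heuristic_ranking(items, partial_ranks):
    """Approximate a low-disagreement ranking; not guaranteed to be optimal."""
    wins = pairwise_wins(partial_ranks)

    def net_wins(x):
        # Times x was placed ahead of another item, minus times it was placed behind
        return sum(wins.get((x, y), 0) - wins.get((y, x), 0)
                   for y in items if y != x)

    # Seed ordering: items with more net wins come first
    order = sorted(items, key=net_wins, reverse=True)

    # Local search: an adjacent swap changes only that one pair's contribution,
    # so swap whenever more judges prefer the opposite order.
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            x, y = order[i], order[i + 1]
            if wins.get((y, x), 0) > wins.get((x, y), 0):
                order[i], order[i + 1] = y, x
                improved = True
    return order


items = ['a', 'b', 'c', 'd', 'e', 'f']
judges = [['a', 'c', 'e'], ['b', 'd', 'f'], ['a', 'b', 'c'], ['d', 'e', 'f']]
order = heuristic_ranking(items, judges)
print(order, disagreements(order, pairwise_wins(judges)))
```

Each swap strictly lowers the disagreement count, so the loop terminates, but it can stop at a local minimum. On small inputs the result can be compared against a brute-force search.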

    I'll leave this up for a couple days for comment before I mark it as "answered". Thanks to everyone who left comments or answers!