key function for heapq.nlargest()


I have a dictionary of {key: count}, say status_count = {'MANAGEMENT ANALYSTS': 13859, 'COMPUTER PROGRAMMERS': 72112}, and I am trying to write a key function for heapq.nlargest() that sorts by count and, when counts tie, by the keys in alphabetical order (a-z). I have to use heapq.nlargest because N is very large and k is small (k = 10).

This is what I have so far:

    top_k_results = heapq.nlargest(args.top_k, status_count.items(), key=lambda item: (item[1], item[0]))

But this breaks ties the wrong way: nlargest prefers the larger key, so entries with equal counts come out in reverse alphabetical order (z-a). Please help!


Solution

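  • The simplest option may be to switch to heapq.nsmallest and negate the count in your sort key, so that the highest counts come first and ties fall back to ascending key order: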

    from heapq import nsmallest
    
    def sort_key(x):
        # x is a (key, count) pair: negated count first, then the key a-z
        return -x[1], x[0]
    
    top_k_results = nsmallest(args.top_k, status_count.items(), key=sort_key)
    

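    Alternatively, keep nlargest and negate each character's code point with ord, which reverses the string comparison so that ties come out in ascending (a-z) order:

    from heapq import nlargest
    
    def sort_key(x):
        # The trailing 1 ranks a key above longer keys that start with it
        # (e.g. 'AB' before 'ABC'); without it, a prefix would sort after them.
        return x[1], [-ord(i) for i in x[0]] + [1]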
    
    top_k_results = nlargest(args.top_k, status_count.items(), key=sort_key)
    

    If the tie-break should ignore case, apply str.casefold to the key string (x[0]) inside sort_key.
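
As a quick sanity check, here is a minimal sketch that runs both approaches on made-up data and compares them against a plain sorted() call. The extra dictionary entries, the top_k variable (standing in for args.top_k), and the function names are assumptions for illustration only. The second key appends a sentinel 1 so that a key sorts ahead of longer keys that start with it.

```python
from heapq import nlargest, nsmallest

# Hypothetical sample data with deliberate ties, including a prefix pair.
status_count = {
    "MANAGEMENT ANALYSTS": 13859,
    "COMPUTER PROGRAMMERS": 72112,
    "ACCOUNTANTS": 13859,
    "ACCOUNTANTS AND AUDITORS": 13859,
    "STATISTICIANS": 500,
}
top_k = 3  # stands in for args.top_k


def by_count_then_name(item):
    # For nsmallest: negated count puts high counts first, name breaks ties a-z.
    return -item[1], item[0]


def by_count_then_reversed_name(item):
    # For nlargest: negated code points reverse the string order; the trailing
    # 1 makes a name rank above any longer name it is a prefix of.
    return item[1], [-ord(c) for c in item[0]] + [1]


via_nsmallest = nsmallest(top_k, status_count.items(), key=by_count_then_name)
via_nlargest = nlargest(top_k, status_count.items(), key=by_count_then_reversed_name)

# Full sort as a reference; fine for a small check, too slow for very large N.
reference = sorted(status_count.items(), key=by_count_then_name)[:top_k]

print(via_nsmallest)
# [('COMPUTER PROGRAMMERS', 72112), ('ACCOUNTANTS', 13859),
#  ('ACCOUNTANTS AND AUDITORS', 13859)]
print(via_nsmallest == via_nlargest == reference)  # True
```

Both calls return the results already ordered best-first, so no extra sort is needed afterwards.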