Python Word Frequency Sort


I have been asked to write a program that counts the words in a text file. I was able to count the words and their frequencies and store them in a dictionary. Now I have to write that data to another text file in decreasing order of frequency. If two words have the same frequency, they must be written to the output file in alphabetical order.

I stored each word and its frequency in a tuple, and did the same for every word in the file, which gives a list of (frequency, word) tuples.

I used .sort(reverse=True) to sort the tuples, but that also sorts words with the same frequency in reverse alphabetical order.

Ex: If my list is:

L = [(4,"hello"),(2,"zebra"),(2,"apple"),(1,"a"),(1,"the"),(1,"bike")]

Output should be:

hello          4  
apple          2  
zebra          2  
a              1  
bike           1  
the            1  

Solution

  • Here is a three-liner (sort, loop, print) that solves the problem:

    L = [(4,"hello"),(2,"zebra"),(2,"apple"),(1,"a"),(1,"the"),(1,"bike")]
    L = sorted(L, key=lambda x: (-x[0],x[1]))
    for i,j in L:
        print(j, i)  # Python 3 print function; prints the word, then its frequency
    

    Output

    hello 4
    apple 2
    zebra 2
    a 1
    bike 1
    the 1
    

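    The idea is that you want to sort on the first component of the tuple in a different order than on the second component. A simple way to do this is to set the sorting key to (-x[0],x[1]): negating the frequency makes larger counts sort first, while the word itself still sorts in ascending alphabetical order. This works because the frequency is a number and can be negated.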
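
To connect this back to the original question (a dictionary of counts written to a text file), here is a minimal sketch that applies the same key. The function names `sort_by_frequency` and `write_frequencies`, the file name `frequencies.txt`, and the 15-character column width are assumptions made for illustration; they are not part of the original answer.

```python
import os
import tempfile


def sort_by_frequency(counts):
    """Return (word, count) pairs: highest count first, ties alphabetical."""
    # Same idea as the answer's key, with the dictionary's (word, count) order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def write_frequencies(counts, path):
    """Write one 'word   count' line per word to path, in sorted order."""
    with open(path, "w") as out:
        for word, count in sort_by_frequency(counts):
            # Left-align the word in a 15-character column (assumed width).
            out.write("{:<15}{}\n".format(word, count))


# Sample data from the question, stored as a word -> frequency dictionary.
counts = {"hello": 4, "zebra": 2, "apple": 2, "a": 1, "the": 1, "bike": 1}
ordered = sort_by_frequency(counts)

# Hypothetical output location; a temporary directory keeps the example tidy.
out_path = os.path.join(tempfile.mkdtemp(), "frequencies.txt")
write_frequencies(counts, out_path)
```

If the primary key could not be negated (a string, for example), one alternative is two passes that rely on Python's sort being stable: sort by the tie-breaker first, then sort by the primary key with reverse=True.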