python-3.x list loops text-files

Count the number of times a word is repeated in a text file


I need to write a program that prompts for the name of a text file and prints the words with the maximum and minimum frequency, along with their frequency (separated by a space).

This is my text:

I am Sam
Sam I am
That Sam-I-am
That Sam-I-am
I do not like
that Sam-I-am
Do you like
green eggs and ham
I do not like them
Sam-I-am
I do not like
green eggs and ham

Code:

file = open(fname,'r')
dict1 = []
for line in file:
  line = line.lower()
  x = line.split(' ')
  if x in dict1:
    dict1[x] += 1 
  else:
    dict1[x] = 1 

I then wanted to iterate over the keys and values to find which words have the maximum and minimum frequency. However, before I even get that far, my console says:

TypeError: list indices must be integers or slices, not list

I don't know what that means either.

For this problem the expected result is:

Max frequency: i 5
Min frequency: you 1

Solution

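  • You are using a list instead of a dictionary to store the word frequencies. A list can only be indexed by integers or slices, and line.split(' ') returns a list of words, so dict1[x] tries to index one list with another list. That is what the TypeError is telling you. To store key-value pairs like this, use a dictionary keyed by word, and loop over the individual words in each line rather than using the whole list as a key. Here is how you could modify your code: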

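    fname = input("Enter file name: ")  # prompt wording is only an example
    word_frequencies = {}  # use a dictionary to store the word frequencies

    with open(fname, 'r') as file:  # the with block closes the file automatically
        for line in file:
            # split() with no argument splits on any whitespace and drops the
            # trailing newline, so "sam" and "sam\n" are counted as the same word
            words = line.lower().split()
            for word in words:
                if word in word_frequencies:
                    word_frequencies[word] += 1
                else:
                    word_frequencies[word] = 1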
    

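    Then iterate over the dictionary to find the words with the maximum and minimum frequency. Because the comparisons are strict, ties go to the first word encountered: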

    # iterate over the keys and values in the word_frequencies dictionary
    # and find the word with the max and min frequency
    max_word = None
    min_word = None
    max_frequency = 0
    min_frequency = float('inf')
    
    for word, frequency in word_frequencies.items():
        if frequency > max_frequency:
            max_word = word
            max_frequency = frequency
        if frequency < min_frequency:
            min_word = word
            min_frequency = frequency
    

    Finally, print the results:

    print("Max frequency:", max_word, max_frequency)
    print("Min frequency:", min_word, min_frequency)
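
As an alternative, the same counting can be sketched with collections.Counter from the standard library. This is only a minimal sketch, not the required approach: the helper names word_frequencies and max_and_min are made up for illustration, and the sample text is inlined so the snippet runs without a file. It assumes words are separated by whitespace and that the input contains at least one word (max and min raise ValueError on an empty counter).

```python
from collections import Counter


def word_frequencies(lines):
    """Count lowercase, whitespace-separated words across an iterable of lines."""
    counts = Counter()
    for line in lines:
        # split() with no argument also discards the trailing newline
        counts.update(line.lower().split())
    return counts


def max_and_min(counts):
    """Return ((max_word, count), (min_word, count)); ties go to the first word seen."""
    max_word = max(counts, key=counts.get)
    min_word = min(counts, key=counts.get)
    return (max_word, counts[max_word]), (min_word, counts[min_word])


# Sample text from the question, inlined here instead of being read from a file.
sample = """I am Sam
Sam I am
That Sam-I-am
That Sam-I-am
I do not like
that Sam-I-am
Do you like
green eggs and ham
I do not like them
Sam-I-am
I do not like
green eggs and ham
"""

counts = word_frequencies(sample.splitlines())
(max_word, max_count), (min_word, min_count) = max_and_min(counts)
print("Max frequency:", max_word, max_count)
print("Min frequency:", min_word, min_count)
```

To use it with a real file, pass the open file object instead of the sample lines, for example inside a with open(fname) as f: block call word_frequencies(f). Counter keeps words in the order they were first seen (Python 3.7+), so max and min pick the earliest word among ties, which matches the loop-based version above.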