Read a file and print the number of words with 1, 2, 3, and 4 letters in Python


I've tried this:

def many(filename):
    'prints the number of words of length 1, 2, 3, and 4 in file filename'
    infile = open(filename)
    content = infile.read()
    infile.close()

    words = content.split()
    count = 0

    for word in words:
        count += ((len(word) == 1))
        print("Words of length 1: {}".format(str(count)))

        count += (len(word) == 2)
        print("Words of length 2: {}".format(str(count)))

        count += (len(word) == 3)
        print("Words of length 3: {}".format(str(count)))

        count += (len(word) == 4)
        print("Words of length 4: {}".format(str(count)))

But the output just cycles through the print statements 15 times, printing 0-15. Any help is appreciated!


Solution

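  • Your problem is that you increment the same count variable for every word, whatever its length (and since the print calls sit inside the loop, all four run for every word):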

    for each word
        if the word is of length 1:
            increment count
        if the word is of length 2:
            increment count
        if the word is of length 3:
            increment count
        if the word is of length 4:
            increment count
    

    In reality, you want to increment different counters based on the length of the word. One way to do this is to maintain four separate counters:

    counter1 = 0
    counter2 = 0
    counter3 = 0
    counter4 = 0
    
    for word in words:
        if len(word) == 1:
            counter1 += 1
        if len(word) == 2:
            counter2 += 1
        if len(word) == 3:
            counter3 += 1
        if len(word) == 4:
            counter4 += 1
    

    Of course, this becomes messy when you want to keep track of words of many more lengths (e.g., "count the number of words of length 1...20" would require you to maintain 20 variables). Imagine what would happen if 20 turned into 100!

    As another user has pointed out, maintaining a list is the simplest way to do this (you could also do it with a dictionary):

    counts = [0, 0, 0, 0]
    for word in words:
        wordLen = len(word)
        if wordLen <= len(counts):  # skip words longer than 4 letters
            countIndex = wordLen - 1  # remember, Python indexes lists from 0
            counts[countIndex] += 1
    
    for i in range(len(counts)):
        print("There are", counts[i], "words of length", i+1)  # again, adjusting for that 0-indexing behavior
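
    The dictionary variant mentioned above might look like the sketch below. The helper name count_by_length and its max_length parameter are made up for illustration, and the sample sentence is invented test data.

```python
def count_by_length(words, max_length=4):
    """Return {length: number of words of that length} for lengths 1..max_length."""
    # Pre-fill the dictionary so lengths with no words still report 0.
    counts = {length: 0 for length in range(1, max_length + 1)}
    for word in words:
        if len(word) in counts:  # words longer than max_length are ignored
            counts[len(word)] += 1
    return counts


counts = count_by_length("a bb cc ddd eeee fffff".split())
for length, count in counts.items():
    print("There are", count, "words of length", length)
```

    Because the keys are the word lengths themselves, there is no 0-indexing adjustment to remember.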
    

    If you wanted to get a little more concise with your code:

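    longestWordLength = 4
    counts = [0]*(longestWordLength+1)  # index 0 stays unused, so counts[n] is the count for length n
    for word in words:
        if len(word) <= longestWordLength:  # ignore longer words instead of raising IndexError
            counts[len(word)] += 1
    for length in range(1, longestWordLength+1):
        print("There are {} words of length {}".format(counts[length], length))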
    

    A slightly cuter option:

    import collections
    
    def many(filename):
        with open(filename) as infile:
            counts = collections.Counter(len(word) for line in infile for word in line.split())
        for length in range(1, 5):  # a Counter returns 0 for lengths it never saw
            print("There are {} words of length {}".format(counts[length], length))
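
    To try the Counter approach end to end without an existing file, something like the following sketch could be used. The function name count_word_lengths, its lengths parameter, and the sample text are all hypothetical; it returns the counts instead of printing them so the result is easy to check.

```python
import collections
import os
import tempfile


def count_word_lengths(filename, lengths=(1, 2, 3, 4)):
    """Return {length: count} for the requested word lengths in the file."""
    with open(filename) as infile:
        counts = collections.Counter(
            len(word) for line in infile for word in line.split()
        )
    # A Counter returns 0 for missing keys, so absent lengths report 0.
    return {length: counts[length] for length in lengths}


# Write a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("I am on a big red bus\nwith four more kids\n")
    sample_path = tmp.name

result = count_word_lengths(sample_path)
os.remove(sample_path)  # clean up the temporary file

for length, count in result.items():
    print("There are {} words of length {}".format(count, length))
```

    Note that split() only separates on whitespace, so punctuation attached to a word counts toward its length.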