
What is the difference between two pieces of code that count words in a text?


Hello, I was trying to count words in a text, but there's a problem.

If I write code like

def popular_words(text:str, list:list)-> dict:
    text=text.lower()
    splited_text=text.split()
    answer={}

    for word in list:
        answer[word]=splited_text.count(word) 

    return answer

print(popular_words('''
When I was One 
I had just begun 
When I was Two 
I was nearly new''', ['i', 'was', 'three', 'near']))

the result is {'i': 4, 'was': 3, 'three': 0, 'near': 0}

It's good, but if I write code without splitting the text like

def popular_words(text:str, list:list)-> dict:
    text=text.lower()
    answer={}
    for word in list:
        answer[word]=text.count(word)

    return answer


print(popular_words('''
When I was One 
I had just begun 
When I was Two 
I was nearly new''', ['i', 'was', 'three', 'near']))


the result is

{'i': 4, 'was': 3, 'three': 0, 'near': 1}

so there's an error in counting 'near'. I think it's because the second code also counts 'nearly' as 'near', but I can't understand why the results differ between the first and second code. Could you explain why the results are different?


Solution

  • The version without split calls count on a string, which counts substring occurrences, so "near" is found inside "nearly". When you split, you call count on a list instead, which counts only elements equal to the argument. The match has to be exact, and "near" is not equal to the "nearly" string in the list.

    As @MattDMo says, using list (or any other built-in name) as a variable name is a bad idea, because it shadows the built-in inside the function.
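
The difference described above can be reproduced in a few lines. This is only a minimal sketch: the parameter name `wanted` and the variable names are made up for illustration (they replace `list` from the question), and it assumes words are separated by whitespace with no punctuation attached.

```python
text = "when i was two i was nearly new"

# str.count counts non-overlapping substring occurrences,
# so "near" is found inside "nearly".
substring_hits = text.count("near")

# list.count counts elements that are equal to the argument,
# and no element of the split text equals "near".
words = text.split()
exact_hits = words.count("near")


def popular_words(text: str, wanted: list) -> dict:
    """Count whole-word occurrences of each word in `wanted`.

    `wanted` is a hypothetical rename of the question's `list`
    parameter, chosen so the built-in `list` is not shadowed.
    """
    words = text.lower().split()
    return {word: words.count(word) for word in wanted}


sample = """
When I was One
I had just begun
When I was Two
I was nearly new"""

result = popular_words(sample, ["i", "was", "three", "near"])
print(substring_hits, exact_hits)
print(result)
```

If the text contained punctuation (for example "near," or "near."), a plain split would leave it attached to the words, so the exact match would miss those too. Stripping punctuation first would be one way to handle that.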