How to count occurrences of substrings in a string from a text file - Python


I want to count the number of lines in a .txt file where the line contains two substrings.

I tried the following:

with open(filename, 'r') as file:
    for line in file:
        wordsList = line.split()
        if any("leads" and "show" in s for s in wordsList):
            repetitions +=1

print "Repetitions: %i"  % (repetitions)

But it doesn't seem to be working as it should. With the following demo input file I got 3 repetitions when it should be 2:

www.google.es/leads/hello/show
www.google.es/world
www.google.com/leads/nyc/oops
www.google.es/leads/la/show
www.google.es/leads/nope/pop
leads.
show

I also tried changing "any" to "all", but I get even stranger results.


Solution

  • "leads" and "show" in s is parsed as: "leads" and ("show" in s), because "in" binds more tightly than "and".

    Python then evaluates "leads" in a boolean context. As a non-empty string it is always truthy, so the "and" simply returns the result of its right-hand operand.

    So, your expression is equivalent to "show" in s

    What you mean is: "leads" in s and "show" in s
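
    Putting that together, here is a minimal sketch of the corrected count, written for Python 3. The helper name count_lines_containing_all and the in-memory demo_lines list are assumptions made for illustration; they are not part of the original code. It tests each whole line rather than each word from split(), which gives the same result for the demo input because no line there contains whitespace.

```python
def count_lines_containing_all(lines, substrings):
    """Count the lines that contain every one of the given substrings."""
    return sum(1 for line in lines if all(sub in line for sub in substrings))


# Demo input from the question, kept in memory so the example is self-contained.
demo_lines = [
    "www.google.es/leads/hello/show",
    "www.google.es/world",
    "www.google.com/leads/nyc/oops",
    "www.google.es/leads/la/show",
    "www.google.es/leads/nope/pop",
    "leads.",
    "show",
]

# The original condition: "leads" is truthy, so only '"show" in line' is tested.
buggy = sum(1 for line in demo_lines if ("leads" and "show" in line))

# The intended condition: both substrings must be present in the line.
fixed = count_lines_containing_all(demo_lines, ["leads", "show"])

print("Buggy count: %i" % buggy)   # 3
print("Repetitions: %i" % fixed)   # 2
```

    To run it against the real file, an open file object can be passed in place of the list, since iterating over a file yields its lines: open the file with "with open(filename, 'r') as file:" as in the question, then call count_lines_containing_all(file, ["leads", "show"]).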