I have a string:
str_x = "121001221122010120211122211122222222112222"
I want to find out how many times a given pattern is observed in the string, but the pattern should be seen as flexible:
The pattern I'm looking for is thus:
A pattern satisfying this condition will thus for example be "22211222", but also "2222111222" and "222222221111111111222"
I want to find out how many times this "flexible pattern" is seen in str_x.
The correct answer here is 2 times.
Any ideas how to do this? Thanks a bunch.
EDIT
Given the definition I placed above, the answer of 2 times is actually incorrect, since valid patterns overlap... e.g. "222111222", "2221112222", "22211122222" etc. are all patterns satisfying the objective.
What I want is to find the number of patterns which do not overlap (that is, still 2 times)
You have to use regex to solve your problem: https://docs.python.org/2/library/re.html
The regular expression:
regex = r"2{3,}?1{2,}?2{3,}?"
means = find at least three 2's followed by at least two 1's followed by at least three 2's
notation 2{3,}
means find all at least three 2's
?
means - greedy search - the search that may overlap
If you want to find patterns that do not overlap - just remove ?
import re
regex = r"2{3,}?1{2,}?2{3,}?"
test_str = "121001221122010120211122211122222222112222"
matches = re.finditer(regex, test_str)
for matchNum, match in enumerate(matches):
matchNum = matchNum + 1
print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))
print ("total matches: {matches}".format(matches= matchNum))