I am trying to find patterns in a sequence of integers. I found an implementation of the Knuth-Morris-Pratt (KMP) algorithm at this link.
I passed the function a 'pattern' to find in a 'text', but what the KMP function returns is an object. I need the indices of the occurrences of the pattern in the text. I tried inspecting the object's attributes by typing a dot and pressing Tab, but nothing shows up. How can I get the indices?
Edit
Code:
> # Knuth-Morris-Pratt string matching
> # David Eppstein, UC Irvine, 1 Mar 2002
>
> from __future__ import generators
>
> def KnuthMorrisPratt(text, pattern):
>
>     '''Yields all starting positions of copies of the pattern in the text.
>     Calling conventions are similar to string.find, but its
>     arguments can be lists or iterators, not just strings, it returns all
>     matches, not just the first one, and it does not need the whole text
>     in memory at once. Whenever it yields, it will have read the text
>     exactly up to and including the match that caused the yield.'''
>
>     # allow indexing into pattern and protect against change during yield
>     pattern = list(pattern)
>
>     # build table of shift amounts
>     shifts = [1] * (len(pattern) + 1)
>     shift = 1
>     for pos in range(len(pattern)):
>         while shift <= pos and pattern[pos] != pattern[pos-shift]:
>             shift += shifts[pos-shift]
>         shifts[pos+1] = shift
>
>     # do the actual search
>     startPos = 0
>     matchLen = 0
>     for c in text:
>         while matchLen == len(pattern) or \
>               matchLen >= 0 and pattern[matchLen] != c:
>             startPos += shifts[matchLen]
>             matchLen -= shifts[matchLen]
>         matchLen += 1
>         if matchLen == len(pattern):
>             yield startPos
Sample Text: [1, 2, 2, 3, 3, 2, 4, 5, 2, 2, 3, 2]
Sample Pattern: [2, 2, 3]
Sample output: [1, 8]
KnuthMorrisPratt is a generator function: calling it returns a generator object rather than a list, so you need to loop over it to get the indices, for example with a list comprehension. A generator does not need a return statement, because its values come out through yield, so the function itself can stay as it is:

    from __future__ import generators  # only needed on Python 2.2; a no-op on later versions

    def KnuthMorrisPratt(text, pattern):
        # allow indexing into pattern and protect against change during yield
        pattern = list(pattern)

        # build table of shift amounts
        shifts = [1] * (len(pattern) + 1)
        shift = 1
        for pos in range(len(pattern)):
            while shift <= pos and pattern[pos] != pattern[pos-shift]:
                shift += shifts[pos-shift]
            shifts[pos+1] = shift

        # do the actual search
        startPos = 0
        matchLen = 0
        for c in text:
            while matchLen == len(pattern) or matchLen >= 0 and pattern[matchLen] != c:
                startPos += shifts[matchLen]
                matchLen -= shifts[matchLen]
            matchLen += 1
            if matchLen == len(pattern):
                yield startPos

    t = [1, 2, 2, 3, 3, 2, 4, 5, 2, 2, 3, 2]
    p = [2, 2, 3]
    indices = [k for k in KnuthMorrisPratt(t, p)]
    print(indices)  # [1, 8]
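For completeness, here is a small sketch of a few other ways to pull the indices out of the generator. The function is a copy of the one above, renamed `knuth_morris_pratt` only so that this snippet runs on its own; the variable names (`matches`, `first`, `rest`, `streamed`) are illustrative and not part of the original code.

```python
def knuth_morris_pratt(text, pattern):
    """Yield every start index of pattern in text (copy of the function above)."""
    pattern = list(pattern)

    # build table of shift amounts
    shifts = [1] * (len(pattern) + 1)
    shift = 1
    for pos in range(len(pattern)):
        while shift <= pos and pattern[pos] != pattern[pos - shift]:
            shift += shifts[pos - shift]
        shifts[pos + 1] = shift

    # do the actual search
    start_pos = 0
    match_len = 0
    for c in text:
        while match_len == len(pattern) or (match_len >= 0 and pattern[match_len] != c):
            start_pos += shifts[match_len]
            match_len -= shifts[match_len]
        match_len += 1
        if match_len == len(pattern):
            yield start_pos


t = [1, 2, 2, 3, 3, 2, 4, 5, 2, 2, 3, 2]
p = [2, 2, 3]

matches = knuth_morris_pratt(t, p)  # a generator object; no searching has happened yet
first = next(matches, None)         # first match, or None if the pattern never occurs
rest = list(matches)                # remaining matches; the generator is now exhausted

all_matches = list(knuth_morris_pratt(t, p))  # a fresh generator gives every index again

# The text may also be a one-shot iterator instead of a list.
streamed = list(knuth_morris_pratt(iter(t), p))

print(first, rest, all_matches, streamed)
```

Note that a generator can only be consumed once: after `rest = list(matches)`, iterating over `matches` again yields nothing, so call the function again if you need the indices a second time.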