I am reading the topcoder tutorial on KMP algorithm and I fail to understand the following part of the text:
Given a string (a quite long one), find all its proper suffixes that are also prefixes of it. All we have to do is just to calculate the "failure function" of the given string and using the information stored in it to print the answer.
I know how to calculate the failure function and for a string abacababa
I get the following array [0, 0, 1, 0, 1, 2, 3, 2, 3]
. The problem is that I can not figure out how it can help me to find
all its proper suffixes that are also prefixes of it
which based on my understanding suppose to be a
and aba
.
The longest proper suffix that is also a prefix is a prefix of length p[n - 1]
. The next one is the longest suffix of this suffix which is also a prefix. It is exactly a prefix of length p[p[n - 1] - 1]
. We keep repeating it until we get an empty prefix.
For example, for that abacaba
string it goes this way:
The longest proper suffix which is also a prefix is aba
(because p[n - 1]
is 3
).
The longest proper suffix which is also a prefix of aba
is a
(because p[2]
is 1
).
There is no such suffix for a
(p[0] = 0
), so we are done.
The code looks like this:
cur = p[n - 1]
while cur != 0:
print(s[0 ... cur - 1])
cur = p[cur - 1]