Search code examples
algorithmknuth-morris-pratt

Explanation about the property of KMP failure function


I am reading the topcoder tutorial on KMP algorithm and I fail to understand the following part of the text:

Given a string (a quite long one), find all its proper suffixes that are also prefixes of it. All we have to do is just to calculate the "failure function" of the given string and using the information stored in it to print the answer.

I know how to calculate the failure function and for a string abacababa I get the following array [0, 0, 1, 0, 1, 2, 3, 2, 3]. The problem is that I can not figure out how it can help me to find

all its proper suffixes that are also prefixes of it

which based on my understanding suppose to be a and aba.


Solution

  • The longest proper suffix that is also a prefix is a prefix of length p[n - 1]. The next one is the longest suffix of this suffix which is also a prefix. It is exactly a prefix of length p[p[n - 1] - 1]. We keep repeating it until we get an empty prefix.

    For example, for that abacaba string it goes this way:

    1. The longest proper suffix which is also a prefix is aba(because p[n - 1] is 3).

    2. The longest proper suffix which is also a prefix of aba is a(because p[2] is 1).

    3. There is no such suffix for a(p[0] = 0), so we are done.

    The code looks like this:

    cur = p[n - 1]
    while cur != 0:
         print(s[0 ... cur - 1])
         cur = p[cur - 1]