Search code examples
stringknuth-morris-pratt

What is the analogy of fall back after mismatch between when using LPS table vs when calculating LPS table itself?


I understand the main part of KMP algorithm relatively well, which is to fall back when mismatch occurs according to LPS table.

However, I always get so confused about the fall back when calculate/create the LPS table itself, that is, what would be the text and pattern, respectively, in that case? could anyone share me some core intuitions behind it to help me better understand it?


Solution

  • After some hard pondering I guess I finally got my light bulb moment.

    The analogy is, when calculating the LPS table, we are effectively trying to match the text, which is the last n+1 characters aka suffix, with the pattern, which is the first n+1 characters aka prefix, where n is the length already matched, and the LPS table for this analogy is the partial LPS created so far.

    Take s="AABAACAABAAB" as an example, when trying to match the last char 'B' with s[5], we are actually trying to match text "AABAAB" with pattern "AABAAC", now mismatch occurs, then, instead of returning back to the very first char, we return to the s[LPS[5-1]], which is 2.