Search code examples
stringalgorithmprefixsuffix

Suffixes of a string that are also prefix of the same string in O(n)


Recently, I encountered a problem on the HackerEarth platform, the main idea for solving the problem was to find which suffixes of a string are also the same string's prefix in linear time(in size of string). For example, in string "abcdzyabc", "abc" is a suffix of the string and it is also its prefix. We need to find all such suffixes in linear time.

Now say I have a boolean array, isSuffixPrefix? of size n i.e. the length of the string str. isSuffixPrefix?[i] is true if the suffix of string str starting at index i i.e. suffix str[i...n-1] is also a prefix of the same string str, it is false otherwise. How can we compute this array in linear time (if possible)?

An example for string s = "aaa":

isSuffixPrefix?[0] = true    // as suffix s[0..2] = "aaa" is also the prefix of string s
isSuffixPrefix?[1] = true    // as suffix s[1..2] = "aa" is also the prefix of string s
isSuffixPrefix?[2] = true    // as suffix s[2..2] = "a" is also the prefix of string s

Solution

  • your problem is not exactly the prefix function cause the prefix function is defined as an array π of length n, where π[i] is the length of the longest proper prefix of the substring s[0…i] which is also a suffix of this substring and not the whole string but you can tweak a bit the function to get what you want. because prefix function is from [0..i] you can reverse the word you want and it would give you the array from [n...i] but also you would need to reverse the array itself:

    
    def pi_prefix_suffix(pattern):
        P = list(pattern)
        m = len(pattern)
        a = [0] * m
        k = 0
    
        for q in range(2, m + 1):
            while k > 0 and P[k] != P[q - 1]:
                k = a[k - 1]
            if P[k] == P[q - 1]:
                k += 1
            a[q - 1] = k
    
        return a;
    def reverse_string(x):
      return x[::-1]
      
    print("abcbabacba")
    print(reverse_string(pi_prefix_suffix(reverse_string("abcbabacba"))));
    
    

    output:

    abcbabacba
    [1, 0, 3, 2, 1, 2, 1, 0, 0, 0]
    

    one last thing, you said that

    isSuffixPrefix?[2] = true // as suffix s[2..2] = "a" is also the prefix of string s

    but this is not true in the formal way cause suffix and prefix can't be the whole word (especially if the word is one letter), by definition it has to be some part of the word, if you would consider the whole word it means that any prefix is also a suffix of the word. (if you want to consider also the last letter just add 1 on the array cause it always true)