I am going through KMP algorithm. Even though this algorithm is clear to understand, I have one doubt here.
prefix table algorithm:
void prefixTable(char p[], int m){
int i=1, j=0, F[0] = 0;
while(i<m){
if(p[i]==p[j]){
F[i]=j+1;
i++;
j++;
}else if(j>0){
j=F[j-1];
}else{
F[i]=0;
i++;
}
}
}
As shown in above image step 5, i=5, j=3, j=F[j-1] is executed as j>0 .
why should we take F[j-1]? why can't we use directly F[0]? How does it make sure correctness of algorithm?
j is the position in the pattern.
if the pattern is processed up to a certain position > 0, then we cannot shift the pattern to the first (0) position if the pattern contains a prefix of itself.
Applied to your example: The pattern is ababaca
. Try to find it in a text abababaca
:
ababa|baca
where the pattern is ababa|c
|ababac
which will never match baca
(note that i will not be changed)aba|bac
which will match baca
ababac|
and the text in state abababac|a
and it gets clear that the found pattern is ab[ababac]a