Search code examples
algorithmstring-matchingboyer-moore

Boyer-Moore String Matching - Good suffix shift


I understand the bad-symbol table. In the good suffix table, shouldn't the distance be calculated as the distance from rightmost occurrence of the pattern to the end of the pattern text ? In that case, shouldn't the below table have all distances (d2) as 1 (except for the last entry which would be 5), because the same pattern would be available to the immediate left of it?

enter image description here

On similar terms, never understood the below table as well. Any help?

enter image description here

Reference:

Question - Page 6, question 7.


Answer - Page 11


The design and analysis of computer algorithms- Anany Levitin ( https://umutzafer.files.wordpress.com/2012/01/solu7.pdf )

Text - The design and analysis of computer algorithms- Anany Levitin ( Page 263 )


Solution

  • The good suffix rule is looking for an instance of the already-matched suffix where the previous character differs. For example, in the case of the "good suffix" 00 in your first table, the shift seeks an instance of 00 not preceded by a 0.

    Why? Because we know that the 0 preceding the "good suffix" didn't match; otherwise we wouldn't be trying to shift.

    In the second table, no such instance can be found. So instead, we look for a plausible prefix of the pattern which matches a suffix of the "good suffix".