I'm a newbie. I've been learning about algorithms for two months now. I'm generally doing okay but I'm not good at understanding search algorithms nor implementing them. I'm particularly stuck on this pattern search algorithm, Reverse Factor. I've been researching it for a week now but I still don't completely understand it, let alone implement it. I don't have anyone I could ask but I don't want to skip any algorithms. So far I've found this algorithm. But I don't understand it well. I'm also not a native speaker. Can you help me?
the purpose is "search a pattern p in string t".
Algorithm RF /* reverse factor string matching */
/* denote t[i + j.. i + m] by x;
it is the last-scanned part of the text */
i:= 0;
while i _< n - m do
{
j:= m;
while j > 1 and x ϵ FACT(p)
do j:=j- 1;
/* in fact, we check the equivalent condition x^R ϵ FACT(p^R) */
if x = p then
report a match at position i;
shift := RF shift[x];
i := i + shift;
}
end.
Fact(p) is the set of all factors (substrings) of p.
Thank you in advance.
I will make a try:
i:= 0;
while i _< n - m do //start at character 0
{
j:= m; //start at character i + m (the potentially last character)
whilej > 1 and x ϵ FACT(p)
do j:=j- 1; //step back as long as t[i+j,i+m] is a substring of the pattern p
/* in fact, we check the equivalent condition x^R ϵ FACT(p^R) */
if x = p then // x=[i+0, i+m] == p
report a match at position i;
shift := RF shift[x]; // look up the number of chars to advance
i := i + shift; // advance
}
The construction of the array shift
is quite hard. I cannot remember how this is done. However I could say what one would find at shift[x]
.
shift[x] = the number of save character shifts such that the next search does not miss a match.
Example: Having a string abcabcdab
and a pattern bcd
(| is i+m, * is i+j
):
abc*|abcdab // start with i=0,j=3
ab*c|abcdab // c is a factor => continue
a*bc|abcdab // bc is a factor => continue
*abc|abcdab // abc is not a factor => shift = shift[bc] = 1
abca*|bcdab
abc*a|bcdab // a is not a factor => shift = shift[] = 3
abcabcd*|ab
abcabc*d|ab // d is a factor => continue
abcab*cd|ab // cd is a factor => continue
abca*bcd|ab // bcd is a factor and j = 0 => report match
See here for an example for debugging in Java. It is not as simple as your pseudocode, but you may debug it for better understanding.