This function takes a string wild
containing '*' and '?' wild cards, and replaces the wildcards with possible chars from a tree database with nodeT *w
. out
holds a temporary string. Each candidates is added to a referenced bst.
void Lexicon::matchRegExpHelper(nodeT *w, string wild, Set<string> &matchSet, string out)
{
if (wild == "") matchSet.add(out);
else {
if (wild[0] != '*' || wild[0] != '?') { //this parses up to the wildcard, earlier versions used a position parameter and looped through the prefix chars recursively
for (int j = 0; j < w->alpha.size(); j++)
if (wild[0] == w->alpha[j].letter) matchRegExpHelper(w->alpha[j].next, wild.substr(1), matchSet, out+=wild[0]);
} else {
for (int i = 0; i < w->alpha.size(); i++) {
if (wild[0] == '?') matchRegExpHelper(w->alpha[i].next, wild.substr(1), matchSet, out+=w->alpha[i].letter);//follow path
else { //logically, wild[0] == '*' must be true
if (ontLength == (wild.length() + out.length())) matchRegExpHelper(w->alpha[i].next, wild.substr(1), matchSet, out+=w->alpha[i].letter); //ontology is full, treat like a '?'
else matchRegExpHelper(w->alpha[i].next, wild.substr(1), matchSet, out+=(w->alpha[i].letter+'*')); //keep adding chars
}
}
}
}
}
When the first wildcard is reached the function starts over - I have tried rewriting this with the for loops, without the loops, and in differing 'prune' approaches. I am missing something basic and suspect this is a backtracking issue. Eventually the stack overflows.
Question: 1) what am I missing conceptually, and 2) how do I fix this function?
version without for loop - the test case is a bit different but similar, I'd have to test it to find it again
else {
if (wild[0] == '?'){
matchRegExpHelper(w, wild, ++pos, matchSet, out);//return and check next path
matchRegExpHelper(w->alpha[pos].next, wild.substr(1), 0, matchSet, out+=w->alpha[pos].letter);//follow path
}
if (wild[0] == '*'){
matchRegExpHelper(w, wild, ++pos, matchSet, out);//return and check next path
if (ontLength == (wild.length() + out.length()))matchRegExpHelper(w->alpha[pos].next, wild.substr(1), 0, matchSet, out+=w->alpha[pos].letter); //ontology is full, treat like a '?'
else matchRegExpHelper(w->alpha[pos].next, wild.substr(1), 0, matchSet, out+=(w->alpha[pos].letter+'*')); //keep adding chars
}
if (wild[0] == w->alpha[pos].letter) matchRegExpHelper(w->alpha[pos].next, wild.substr(1), 0, matchSet, out+=wild[0]);
matchRegExpHelper(w, wild, ++pos, matchSet, out);//check next path
}
for (int i = 0; i < w->alpha.size(); i++) matchRegExpHelper(w->alpha[i].next, wild.substr(1), 0, matchSet, out+=wild[0]);//step over char
the for loop at the end was an attempt to fix the overflow, i thought maybe there was no case for some threads, but I wanted those to prune, so not sure what to do
There were three issues with this code:
@Basile_Starynkevitch noted the importance of using the debugger to follow the path - in this case with VS2008, the stack allowed parsing the steps and noting the path moving back through several functions.
@Picarus correctly noted the error in the terminating condition
These solutions led to finding:
the solution, this void function requires two return;
, one after the termination condition to terminate and another at the end to catch pruned threads. from reading several websites, it seems that a return;
in a void function is unusual, but this case with the recursion and multiple paths, it was required.