Search code examples
c++repeatsuffix-tree

Generalising Boolean Functions for more elegant code


I've always been of the belief that if you're copying and pasting code, then there's a more elegant solution. I'm currently implementing a string suffix trie in C++ and have two practically identical functions which only differ in their return statement.

The first function checks if a substring is present, the second if it is a suffix (all strings have the character # added to the end).

substring(string S)

bool Trie::substring(string S){
    Node* currentNode = &nodes[0];                          //root
    for(unsigned int i = 0; i < S.length(); i++){
        int edgeIndex = currentNode->childLoc(S.at(i));
        if(edgeIndex == -1)
            return false;
        else currentNode = currentNode->getEdge(edgeIndex)->getTo();
    }
    return true;
}

suffix(string S)

bool Trie::suffix(string S){
    Node* currentNode = &nodes[0];                          //root
    for(unsigned int i = 0; i < S.length(); i++){
        int edgeIndex = currentNode->childLoc(S.at(i));
        if(edgeIndex == -1)
            return false;
        else currentNode = currentNode->getEdge(edgeIndex)->getTo();
    }
    return (currentNode->childLoc('#') == -1)? false : true;    //this will be the index of the terminating character (if it exists), or -1.
}

How can the logic be generalised more elegantly? Perhaps using templates?


Solution

  • What I, personally, do in such circumstances, is to move the common code to the function, which I call from multiple places, where I need, said common functionality. Consider this:

    Node* Trie::getCurrentNode (string S){
        Node* currentNode = &nodes[0];  
        for(unsigned int i = 0; i < S.length(); i++){
            int edgeIndex = currentNode->childLoc(S.at(i));
            if(edgeIndex == -1)
                return nullptr;
            else currentNode = currentNode->getEdge(edgeIndex)->getTo();
        }
        return currentNode;
    }
    

    And then, use it, in all the cases, where you need it:

    bool Trie::substring(string S){
        return getCurrentNode (S) != nullptr;
    }
    
    bool Trie::suffix(string S){
        Node* currentNode = getCurrentNode(S);
        return currentNode != nullptr && currentNode->childLoc('#') != -1;
        // I decided to simplify the return statement in a way François Andrieux suggested, in the comments. It makes your code more readable.
    }