Search code examples
c++binary-treebinary-search-treestring-comparison

How do I compare strings (insensitive case), while inserting same strings into a binary Tree (sensitive case)?


Essentially, I am a brick wall as to how I should compare strings with my insert function, not taking case into consideration while simultaneously inserting those same strings with their original case.

Here is my insert function.

TreeNode* Tree::insert(TreeNode* node, string value) {
    transform(value.begin(), value.end(), value.begin(), ::tolower);

    if (node == nullptr) {
        return newTreeNode(value);
    }
    if (node->data < value) {
        node->left = insert(node->left, value);
    }
    else if(node-> data > value) {
        node->right = insert(node->right, value);
    }
    else {
        return node;
    }
    node->height = 1 + max(height(node->left), height(node->right));
    return node;
}

Here is my tree header file:

struct TreeNode {
public:
    TreeNode* left;
    TreeNode* right;
    string data;
};

class Tree {
public:
    TreeNode * newTreeNode(string data);
    TreeNode * insert(TreeNode* node, string value);
    void lexographicPrint(TreeNode* root);
};

newTreeNode Funciton:

TreeNode* AvlTree::newTreeNode(string value) {
    TreeNode* treeNode = new TreeNode();
    treeNode->data = value;
    treeNode->left = nullptr;
    treeNode->right= nullptr;
    treeNode->height = 1;
    return treeNode;
}

Print Function:

void AvlTree::lexographicPrint(TreeNode* root) {
    if (root != nullptr) {
        lexographicPrint(root->right);
        cout << root->data << " ";
        lexographicPrint(root->left);
    }
}

This currently works as I want it to except for the fact that the Tree contains all values as lower case, obviously due to the transform function. I have tried using a holdValue, like so:

string holdValue;
if (isupper(value[0]) {
    holdValue = value;
}

at the top of my function, replacing all insert calls with holdValue. I am confused as to why that changes the order of my tree when comparisons are still made with value. I expected that to work, but it does not. I have yet to find a solution through Google searches.


Solution

  • Rather than use std::string's <, you can use a case insensitive comparison.

    bool ci_less(const std::string & lhs, const std::string & rhs) {
        return std::lexicographical_compare(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(), [](char l, char r){ return std::to_lower(l) < std::tolower(r); });
    }
    
    TreeNode* Tree::insert(TreeNode* node, std::string value) {
    
        if (node == nullptr) {
            return newTreeNode(std::move(value));
        }
        if (ci_less(node->data, value)) {
            node->left = insert(node->left, std::move(value));
        }
        else if(ci_less(value, node->data)) {
            node->right = insert(node->right, std::move(value));
        }
        else {
            return node;
        }
        node->height = 1 + max(height(node->left), height(node->right));
        return node;
    }
    

    You will need to #include <algorithm> for std::lexicographical_compare.

    In a similar vein, you could instead define a case insensitive string type

    struct ci_char_traits : public std::char_traits<char> {
        static char to_upper(char ch) {
            return std::toupper((unsigned char) ch);
        }
        static bool eq(char c1, char c2) {
             return to_upper(c1) == to_upper(c2);
         }
        static bool lt(char c1, char c2) {
             return to_upper(c1) <  to_upper(c2);
        }
        static int compare(const char* s1, const char* s2, size_t n) {
            while ( n-- != 0 ) {
                if ( to_upper(*s1) < to_upper(*s2) ) return -1;
                if ( to_upper(*s1) > to_upper(*s2) ) return 1;
                ++s1; ++s2;
            }
            return 0;
        }
        static const char* find(const char* s, int n, char a) {
            auto const ua (to_upper(a));
            while ( n-- != 0 ) 
            {
                if (to_upper(*s) == ua)
                    return s;
                s++;
            }
            return nullptr;
        }
    };
    
    using ci_string = std::basic_string<char, ci_char_traits>;