Tags: c++, string, c++17, case-insensitive

C++ vector/string problem with case sensitivity


So in C++, 'A' and 'a' are different characters. If we have a vector that contains both uppercase and lowercase letters, how can we write a function that transforms this vector into one that is case insensitive, so that, for example, "ABba" becomes the same as "abba"?

For example, I want to count the number of different characters in a string. For "ABba" the output must be 2, because 'A' and 'a' belong to the same group, and so do 'B' and 'b'. The string may also contain digits: for "ABba1212" the answer should be 4.


Solution

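  • The standard approach for case-insensitive comparisons is:

    Pick either upper or lower case, convert all letters to that case, and then do your operations.

    C++ has a family of functions for this purpose: std::toupper and std::tolower, declared in <cctype>. Please check them on cppreference. Note that they take an int, and the behavior is undefined if the value is neither EOF nor representable as unsigned char, so a plain char should be cast to unsigned char before the call.

    If you know which character set you have, there are other possibilities. In western countries, ASCII characters are very often used. In ASCII, an uppercase letter and its lowercase counterpart differ only by 32 (a single bit), so you can also convert with a bit operation, or with an addition or subtraction.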

    Some examples of converting to lower case with ASCII characters:

    #include <iostream>
    #include <cctype>
    
    int main() {
    
        char c = 'A';
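        // Cast to unsigned char first: std::tolower is undefined for negative char values
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));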
        std::cout << "\nWith tolower:  " << c << '\n';
    
        c = 'A';
        if (c >= 'A' and c <= 'Z') c += ('a' - 'A');
        std::cout << "\nWith addition:  " << c << '\n';
    
        c = 'A';
        if (c >= 'A' and c <= 'Z') c |= 32;
        std::cout << "\nWith bit operation:  " << c << '\n';
    }
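
    The single-character conversions above extend to a whole string. The following is a minimal sketch, assuming the default "C" locale and single-byte characters; toLowerCopy, equalsIgnoreCase and checkToLowerCopy are hypothetical helper names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (illustration only): returns a lowercase copy of `text`.
std::string toLowerCopy(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char ch) {
                       // The parameter is unsigned char, so the value passed
                       // to std::tolower is always in its valid range.
                       return static_cast<char>(std::tolower(ch));
                   });
    return text;
}

// Hypothetical helper: case-insensitive equality built on toLowerCopy.
bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    return toLowerCopy(a) == toLowerCopy(b);
}

// Usage check based on the example from the question.
void checkToLowerCopy() {
    assert(toLowerCopy("ABba") == "abba");
    assert(equalsIgnoreCase("ABba", "abba"));
}
```

    Taking the parameter by value gives the function its own copy to modify, so the caller's string stays untouched.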
    

    Next, counting different characters. To count the different characters in a string, you need to iterate over it and record which characters you have already seen. There are many different solutions for that.

    I will show you a very basic one first and then a more idiomatic C++ solution.

    The basic one is made for 8-bit char values.

    #include <iostream>
    #include <cctype>
    #include <string>
    
    int main() {
    
        std::string test{"ABba1212"};
    
        // There are 256 different 8 bit char values. Create array and initialize everything to false
        bool weHaveACharValueForThisASCII[256] = {};
    
        // Iterate over all characters in the source string
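        for (char c : test)
            // And mark, if we found a certain char (cast first: char may be signed,
            // and a negative value would be an invalid argument and array index)
            weHaveACharValueForThisASCII[std::tolower(static_cast<unsigned char>(c))] = true;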
    
        // Now we want to count, how many different chars we found
        int sum = 0;
        for (bool b : weHaveACharValueForThisASCII) if (b) ++sum;
    
        // Show result
        std::cout << sum;
    
        return 0;
    }
    

    In C++ you would use a std::unordered_set for this. It can only contain unique values and uses fast hashing for data access. Please see its entry on cppreference.

    #include <iostream>
    #include <cctype>
    #include <string>
    #include <unordered_set>
    
    int main() {
    
        std::string test{"ABba1212"};
    
        // Here we will store the unique characters
        std::unordered_set<char> unique{};
    
        // Iterate over all characters in the source string
        for (char c : test) unique.insert(static_cast<char>(std::tolower(static_cast<unsigned char>(c))));
    
        // Show result
        std::cout << unique.size();
    }
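
    Since the question asks for a function and mentions a vector, the same idea can be wrapped in a reusable helper. This is only a sketch under the same assumptions as above (default "C" locale, single-byte characters); countDistinctIgnoreCase and checkCountDistinct are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (illustration only): counts the distinct characters in
// any range of char, e.g. std::string or std::vector<char>, ignoring case.
template <typename CharRange>
std::size_t countDistinctIgnoreCase(const CharRange& chars) {
    std::unordered_set<char> seen;
    for (char c : chars) {
        // Normalize to lower case; the cast keeps the argument in the valid range
        seen.insert(static_cast<char>(std::tolower(static_cast<unsigned char>(c))));
    }
    return seen.size();
}

// Usage check with the two examples from the question.
void checkCountDistinct() {
    assert(countDistinctIgnoreCase(std::string{"ABba"}) == 2);
    assert(countDistinctIgnoreCase(
               std::vector<char>{'A', 'B', 'b', 'a', '1', '2', '1', '2'}) == 4);
}
```

    Pass a std::string or std::vector<char> rather than a raw string literal: a literal is deduced as a char array that includes the terminating '\0', which would be counted as one more character.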