Tags: c++, boost, boost-spirit, case-insensitive, boost-spirit-lex

Case-insensitive keywords with boost::spirit::lex


Is there a way to recognize specific patterns case-insensitively?

E.g. if I have

literal_bool = L"True|False";
this->self.add(literal_bool, TokenId_LiteralBool);

How can I match true, TRUE, and tRuE without having to write [Tt][Rr][Uu][Ee] for each keyword?


Solution

  • Regular expressions supported by boost::spirit::lex include a case-sensitivity control:

    (?r-s:pattern)

    Apply the options in 'r' and omit the options in 's' while interpreting pattern. Here 'r' and 's' are placeholders for option sets, each of which may contain zero or more of the characters 'i' or 's'. 'i' means case-insensitive. '-i' means case-sensitive. 's' alters the meaning of the '.' syntax to match any single character whatsoever. '-s' alters the meaning of '.' to match any character except '\n'.

    Thus you can write:

    literal_bool = L"(?i:true|false)";
    this->self.add(literal_bool, TokenId_LiteralBool);
    

    Original answer

    Introduce a function that makes a pattern case insensitive:

    literal_bool = L"True|False";
    this->self.add(make_case_insensitive(literal_bool), TokenId_LiteralBool);
    

    Implementation for regular (non-wide) strings:

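    #include <cctype>
    #include <string>

    // Expands each letter into a two-character class, e.g. 'a' -> "[aA]".
    // Note: the input is not parsed as a regex, so letters inside escapes
    // (such as \d) or existing character classes are rewritten as well.
    std::string make_case_insensitive(const std::string& s)
    {
        std::string r;
        std::string cC = "[xX]";
        for (char c : s)
        {
            // Cast to unsigned char: passing a negative char to the
            // <cctype> functions is undefined behavior.
            const unsigned char uc = static_cast<unsigned char>(c);
            if (std::isalpha(uc))
            {
                cC[1] = static_cast<char>(std::tolower(uc));
                cC[2] = static_cast<char>(std::toupper(uc));
                r += cC;
            }
            else
                r += c;
        }
        return r;
    }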
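
The question uses wide string literals (L"..."), while the implementation above handles only narrow strings. A minimal sketch of a wide-character counterpart might look like the following. The name make_case_insensitive_w is hypothetical and not part of the original answer, and the sketch assumes the classification functions from <cwctype> under the current C locale are good enough for the keywords involved.

```cpp
#include <cwctype>
#include <string>

// Hypothetical wide-string variant of make_case_insensitive (not from the
// original answer). Expands each letter into a class, e.g. L'a' -> L"[aA]".
// Like the narrow version, it does not parse the input as a regex, so
// letters inside escapes or existing character classes are rewritten too.
std::wstring make_case_insensitive_w(const std::wstring& s)
{
    std::wstring r;
    std::wstring cC = L"[xX]";
    for (wchar_t c : s)
    {
        if (std::iswalpha(c))
        {
            // towlower/towupper return wint_t, so convert back explicitly.
            cC[1] = static_cast<wchar_t>(std::towlower(c));
            cC[2] = static_cast<wchar_t>(std::towupper(c));
            r += cC;
        }
        else
            r += c;
    }
    return r;
}
```

With this, make_case_insensitive_w(L"True|False") yields L"[tT][rR][uU][eE]|[fF][aA][lL][sS][eE]", which could then be passed to this->self.add in place of the original pattern. Where the lexer supports it, the (?i:...) form remains the simpler choice.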