regex, pcre, glib, vala

GLib regex for matching a whole word?


To match a whole word, the regex \bword\b should suffice. Yet the following code always returns 0 matches:

try {
    string pattern = "\bhtml\b";
    Regex wordRegex = new Regex (pattern, RegexCompileFlags.CASELESS, RegexMatchFlags.NOTEMPTY);
    MatchInfo matchInfo;
    string lineOfText = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">";

    wordRegex.match (lineOfText, RegexMatchFlags.NOTEMPTY, out matchInfo);
    stdout.printf ("Match count is: %d\n", matchInfo.get_match_count ());
} catch (RegexError regexError) {
    stderr.printf ("Regex error: %s\n", regexError.message);
}

This should work: testing the \bhtml\b pattern against the same string in regex testing engines returns one match. In this program, however, it returns 0 matches. Is the code wrong? What regex should be used in GLib to match a whole word?


Solution

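  • It looks like you have to escape the backslash too. In a Vala string literal, \b is the escape sequence for a backspace character, so the pattern "\bhtml\b" reaches the regex engine as backspace, html, backspace rather than as a word boundary. Doubling the backslash passes a literal \b through to the engine: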

    try {
        string pattern = "\\bhtml\\b";
        Regex wordRegex = new Regex (pattern, RegexCompileFlags.CASELESS, RegexMatchFlags.NOTEMPTY);
        MatchInfo matchInfo;
        string lineOfText = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">";

        wordRegex.match (lineOfText, RegexMatchFlags.NOTEMPTY, out matchInfo);
        stdout.printf ("Match count is: %d\n", matchInfo.get_match_count ());
    } catch (RegexError regexError) {
        stderr.printf ("Regex error: %s\n", regexError.message);
    }
    

    Output:

    Match count is: 1
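
As a possible variation on the fix above, the sketch below wraps the idea in a helper. Note that get_match_count () reports the number of matched substrings within a single match (the whole match plus any capture groups), so it stays at 1 even if the word occurs several times; to count occurrences, the helper steps through the matches with MatchInfo.next (). The function name count_whole_word is hypothetical and not part of GLib. It also assumes you want arbitrary words treated literally, hence Regex.escape_string, and it uses a Vala verbatim string ("""...""") so that \b needs no doubled backslash.

```vala
// Hypothetical helper (not part of GLib): counts case-insensitive,
// whole-word occurrences of `word` in `text`.
int count_whole_word (string text, string word) throws RegexError {
    // Verbatim strings do not process escapes, so \b reaches the regex
    // engine as a word boundary. escape_string () neutralises any regex
    // metacharacters contained in `word`.
    string pattern = """\b""" + Regex.escape_string (word) + """\b""";
    var regex = new Regex (pattern, RegexCompileFlags.CASELESS);

    MatchInfo match_info;
    int count = 0;
    regex.match (text, 0, out match_info);

    // matches () is true while the current position holds a match;
    // next () advances to the following match, if there is one.
    while (match_info.matches ()) {
        count++;
        match_info.next ();
    }
    return count;
}
```

Called from main inside a try block, count_whole_word (lineOfText, "html") should count only the standalone html after DOCTYPE, since XHTML and xhtml1 have no word boundary before the h.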
    
