c++ regex memory-leaks crash boost-xpressive

wsregex::compile crashes (memory leak) when handling regex string?


I would like to understand why my program crashes when I call Boost.Xpressive's wsregex::compile with the following pattern:

(?P<path>\b[a-z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*)?
(:)?
(?P<ip>(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)
(;(?P<port>\d*))?
(:(?P<port>\b\d+\b):(?P<password>[\w]*))?
(:(?P<password>\b\d+\b))?

In RegexBuddy everything appears to be fine. I used RegexBuddy's JGSoft flavor option.

I am validating the following:

c:\My Documents\Test\test.csv:1.12.12.13:111:admin
c:\My Documents\Test\test.csv:1.12.12.13:111
c:\My Documents\Test\test.csv:1.12.12.13;111
1.12.12.13:111
1.12.12.13;111

Can you help me? Thanks a lot.


Solution

  • This is neither a memory leak nor a crash as far as I can tell. Xpressive is throwing an exception because the pattern is invalid; if nothing catches that exception, the program terminates, which can look like a crash. The following program:

    #include <iostream>
    #include <boost/xpressive/xpressive_dynamic.hpp>
    
    namespace xpr = boost::xpressive;
    
    int main()
    {
        const char pattern[] =
            "(?P<path>\\b[a-z]:\\\\(?:[^\\\\/:*?\"<>|\\r\\n]+\\\\)*[^\\\\/:*?\"<>|\\r\\n]*)?"
            "(:)?"
            "(?P<ip>(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b)"
            "(;(?P<port>\\d*))?"
            "(:(?P<port>\\b\\d+\\b):(?P<password>[\\w]*))?"
            "(:(?P<password>\\b\\d+\\b))?";
        try
        {
            xpr::sregex rx = xpr::sregex::compile(pattern);
        }
        catch(xpr::regex_error const & e)
        {
            std::cout << e.what() << std::endl;
        }
    }
    

    Outputs:

    named mark already exists
    

    Indeed, it does. This pattern uses "port" and "password" twice each as the name of a capturing group, and Xpressive doesn't allow that. Just pick unique names for your captures and you should be fine.
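If the pattern is long, it can help to list the repeated names before renaming them. The following is only a minimal sketch using the standard library: `duplicate_group_names` is a hypothetical helper, not part of Boost.Xpressive, and it does a plain text scan for `(?P<name>` without understanding escapes or character classes, so treat its result as a hint rather than a full validation.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not a Boost.Xpressive API): returns every group name
// that occurs more than once in "(?P<name>" constructs, in the order the
// second occurrence is encountered.
// Assumption: a naive text scan is good enough; escaped parentheses and
// character classes are not interpreted.
std::vector<std::string> duplicate_group_names(const std::string& pattern)
{
    const std::string opener = "(?P<";
    std::map<std::string, int> counts;
    std::vector<std::string> duplicates;

    std::size_t pos = 0;
    while ((pos = pattern.find(opener, pos)) != std::string::npos)
    {
        const std::size_t start = pos + opener.size();
        const std::size_t end = pattern.find('>', start);
        if (end == std::string::npos)
            break;                       // unterminated name: stop scanning

        const std::string name = pattern.substr(start, end - start);
        if (++counts[name] == 2)         // report each duplicate only once
            duplicates.push_back(name);

        pos = end + 1;                   // continue after the closing '>'
    }
    return duplicates;
}
```

Called on the pattern from the question, this should report "port" and "password". Renaming the later occurrences (for example to "port2" and "password2", names chosen here purely for illustration) gives each capture a unique name, which is what the error message asks for.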