
How to match a string ending with : or not in regex


I am using RE2 (re2.h) and doing a partial match with RE2::PartialMatch.

If the input string is "123.45.34.5:8080", the match should return "123.45.34.5" and "8080".

If the input string is "123.45.34.5", it should return "123.45.34.5" and "". How should I write the regex? The following code does not work.

string portRegex = "[ \r\t]*([0-9a-f]*)[ \r\t]*";
string IPRegex = "([^ \r\t]*)^[^:]*";
string alertRegexStr = IPRegex + portRegex;
m_alertRegex = new RE2(alertRegexStr.c_str());

bool match = RE2::PartialMatch(input_string, *m_alertRegex,
                               &cip,
                               &source_port);

Thanks,

UPDATE

Now the following code works.

string IPRegex = "([^ \r\t:]*)";

string portRegex = "[ \r\t]*:?[ \r\t]*([0-9a-f]*)[ \r\t]*";

But I have a question: why does string IPRegex = "([^ \r\t:]*?)"; not work? What is the difference between * and *?


Solution

  • To capture the parts on both sides of the colon, you can use

    ^([^:]*)(?::([^:]+))?$
    

    The results are in capture groups 1 and 2; group 2 does not participate in the match when the input has no colon. (In the regex demo, \n is used for demo purposes only, because multiline mode is on.)
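A minimal sketch of how this pattern behaves on the two inputs from the question. Note the assumptions: std::regex from the C++ standard library is used here as a stand-in for RE2 so that the snippet has no external dependency, and splitHostPort is a hypothetical helper name, not part of any library. The pattern uses only a non-capturing group and anchors, which RE2 also supports, so the same pattern string should work with RE2::FullMatch and two std::string output arguments, although that is not verified here.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: splits "host[:port]" into host and port.
// Returns false when the input does not fit the pattern at all
// (for example "a:b:c" or a trailing colon such as "host:").
// std::regex is only a stand-in for RE2 in this sketch.
bool splitHostPort(const std::string& input,
                   std::string* host,
                   std::string* port) {
    static const std::regex pattern("^([^:]*)(?::([^:]+))?$");
    std::smatch m;
    if (!std::regex_match(input, m, pattern)) {
        return false;
    }
    *host = m[1].str();
    // Group 2 does not participate when there is no ":port" part;
    // str() then yields an empty string.
    *port = m[2].str();
    return true;
}
```

Calling splitHostPort("123.45.34.5:8080", &host, &port) fills host with "123.45.34.5" and port with "8080"; for "123.45.34.5" the port comes back as an empty string.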

    Regarding your question:

    why does string IPRegex = "([^ \r\t:]*?)"; not work? What is the difference between * and *?

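    It works, but it does not match what you expect. The lazy group is allowed to match an empty string, and since everything that follows it in your pattern is optional, it does exactly that. In a regex tester you will see empty matches between characters and matches of single characters.

    Note that * is a greedy quantifier that matches 0 or more characters, as many as possible, while *? is a lazy quantifier that matches 0 or more characters, as few as possible. A lazy quantifier makes the quantified character class match only as many characters as the rest of your pattern needs in order to succeed. In other regex flavors, you could follow the lazy group with a positive look-ahead (?=:), but RE2 does not support look-arounds.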

    More details on lazy matching can be found at rexegg.com and regular-expressions.info.
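To make the greedy/lazy difference concrete, here is a small sketch that runs both variants against the input from the question. Assumptions: std::regex is again used as a stand-in for RE2, the patterns are simplified versions of the ones in the question (the optional whitespace parts are dropped), and firstTwoGroups, kGreedy and kLazy are hypothetical names introduced only for this illustration. RE2 follows the same leftmost-first preference rules for lazy quantifiers, so it should give the same result, but that is not verified here.

```cpp
#include <regex>
#include <string>
#include <utility>

// Simplified versions of the patterns from the question.
const std::string kGreedy = "([^ \r\t:]*):?([0-9a-f]*)";
const std::string kLazy   = "([^ \r\t:]*?):?([0-9a-f]*)";

// Hypothetical helper: runs an unanchored search (similar in spirit to a
// partial match) and returns the text of capture groups 1 and 2.
std::pair<std::string, std::string> firstTwoGroups(const std::string& pattern,
                                                   const std::string& input) {
    std::smatch m;
    if (!std::regex_search(input, m, std::regex(pattern))) {
        return {"", ""};
    }
    return {m[1].str(), m[2].str()};
}
```

With "123.45.34.5:8080" as input, kGreedy yields "123.45.34.5" and "8080". kLazy yields "" and "123": the lazy group stops at zero characters because the optional colon and the digit group can already match right at the start of the string.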