With this regex, I would like to match a time with or without a milliseconds (ms) field. For completeness, here is the regex (I removed the anchors in regex101 to enable multi-line matching):
^(0[0-9]|1[0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])(?:|(?:\.)([0-9]{1,6}))$
I kind of don't understand how C++ behaves here. In regex101, the number of capture groups reported depends on the string. If there's no ms, it's 3+1 (since C++ uses match[0] for the whole matched pattern), and if there's ms, then it's 4+1. But then in this example:
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::regex timeRegex(R"(^(0[0-9]|1[0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])(?:|(?:\.)([0-9]{1,6}))$)");
    std::smatch m;
    std::string strT = "12:00:09";
    bool timeMatch = std::regex_match(strT, m, timeRegex);
    std::cout << m.size() << std::endl;
    if (timeMatch)
    {
        std::cout << m[0] << std::endl; // whole match
        std::cout << m[1] << std::endl; // hours
        std::cout << m[2] << std::endl; // minutes
        std::cout << m[3] << std::endl; // seconds
        std::cout << m[4] << std::endl; // ms field (empty here)
    }
}
We see that m.size() is always 5, whether or not there is an ms field! m[4] is an empty string if there's no ms field. Is this the default behavior of regex in C++? Or should I use try/catch (or some other safety measure) when in doubt about the size? I mean... even the size is a little misleading here!
After a successful match, m.size() will always be the number of marked subexpressions in your expression plus 1 (for the whole expression).
Your regex has 4 marked subexpressions; whether or not they take part in the match has no effect on the size of m.
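As a rough sketch of that rule, the snippet below compares the pattern's own group count (std::regex::mark_count()) with m.size() for inputs with and without the ms field. The names kTimeRegex, markedGroups and sizeAfterMatch are made up for this illustration and are not part of the question's code.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Same pattern as in the question (kTimeRegex is a hypothetical name).
static const std::regex kTimeRegex(
    R"(^(0[0-9]|1[0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])(?:|(?:\.)([0-9]{1,6}))$)");

// Hypothetical helper: number of marked subexpressions in the pattern itself.
std::size_t markedGroups()
{
    return kTimeRegex.mark_count();
}

// Hypothetical helper: m.size() after trying to match `text`.
// On success this is mark_count() + 1; on failure the result is empty (size 0).
std::size_t sizeAfterMatch(const std::string& text)
{
    std::smatch m;
    std::regex_match(text, m, kTimeRegex);
    return m.size();
}
```

With these helpers, sizeAfterMatch("12:00:09") and sizeAfterMatch("12:00:09.123") should both equal markedGroups() + 1, so the size tells you about the pattern, not about which optional groups were actually hit.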
If you want to know whether there are milliseconds, you can check:
m[4].matched
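A minimal sketch of how that check could be wrapped up, assuming the same regex as in the question. The function name hasFraction and its out-parameter are hypothetical, chosen only for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true only if `text` is a valid time AND the
// optional fractional field took part in the match. In that case the digits
// after the dot are copied into `fraction`; otherwise it is left untouched.
bool hasFraction(const std::string& text, std::string& fraction)
{
    static const std::regex timeRegex(
        R"(^(0[0-9]|1[0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])(?:|(?:\.)([0-9]{1,6}))$)");

    std::smatch m;
    if (!std::regex_match(text, m, timeRegex))
        return false;            // not a valid time at all

    if (!m[4].matched)
        return false;            // valid time, but no fractional field

    fraction = m[4].str();       // e.g. "123" for "12:00:09.123"
    return true;
}
```

No try/catch is needed for this: m[4] always exists after a successful match, and the matched flag is what distinguishes "group not used" from "group matched".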