I am trying to parse the contents of a file with a regex:
ifstream file_stream("commented.cpp",ifstream::binary);
std::string txt((std::istreambuf_iterator<char>(file_stream)),
std::istreambuf_iterator<char>());
cmatch m;
bool result = regex_search(txt.c_str(), m, regex("^#(\S*)$",regex_constants::basic));
The file is a C source file, and it begins with the line:
#include <stdio.h>
I'm trying to parse a preprocessor directive. I checked the regex in RegexBuddy and it works there, but with std::regex, regex_search returns false. It seems that the $ character is not being recognized, and neither is ^ with the POSIX syntax. I have tried the ECMAScript syntax, and the regex works only if I remove the $ symbol.
//ecmascript syntax
bool result = regex_search(txt.c_str(), m, regex("^#(\S*)"));
The file is read with the binary flag, so the txt string keeps all the \r\n characters, which are required for the $ syntax. How can I resolve this issue?
Note that the $ anchor in most cases works only as an end-of-string (whole input) anchor. See this thread. You can make the pattern match at the end of a line by replacing $ with a custom boundary pattern based on a positive lookahead, (?=$|\r?\n). Lookaheads are available in the default ECMAScript grammar, not in the POSIX basic one.
Another issue is that you are using the \S escape sequence in a regular string literal. That means it is treated as the letter S, not as the non-whitespace pattern. Use a raw string literal so that a single \ is enough to write a regex escape such as \d or \s, or double the backslash (\\S) in a regular string literal.
Also, as @HWalters already noted, ^#\S+$ will not match #include <stdio.h> because you need to account for the space inside. Thus, your regex might look like ^#include[ \t]+(\S+)(?=$|\r?\n): it makes sure you have #include, then some horizontal whitespace, and then captures one or more (+) non-whitespace chars up to the end of the string or a line break (CRLF or LF).
And here is a snippet:
std::regex r(R"(^#include[ \t]+(\S+)(?=$|\r?\n))");
std::string s("#include <stdio.h>\r\n#include <regex>");
std::smatch m;
if (std::regex_search(s, m, r)) {
    std::cout << m[1] << std::endl; // prints <stdio.h>
}