I have to extract single-line comments from a qmake project file.
The rules are simple: a comment begins with the # symbol and ends with a line break (\n).
So I read some documentation about QRegExp and wrote the following code to print all comments in a qmake file:
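QRegExp comment_expr ("#(.*)\n$");
comment_expr.setMinimal (true);
int comment_index = 0;
while ((comment_index = _project_contents.indexOf (comment_expr, comment_index)) != -1)
{
    QString comment_text = comment_expr.cap (0);
    qDebug() << "Comment 1" << comment_text;
    // Advance past this match; otherwise the same match is found again forever.
    comment_index += comment_expr.matchedLength ();
}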
But it does not work correctly: the entire contents of the project file are printed instead. Where is my mistake? As I understand from the docs, this should work, but it doesn't.
P.S. I'm a newbie at regexes, so please don't beat me :)
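The problem is that in QRegExp . "matches any character (including newline)", and $ matches only at the end of the string. So even with minimal matching, the pattern has to run from the first # all the way to the last newline of the file.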
You could try using "not a newline", [^\n], and changing the $ to (\n|$) (newline or end of string):
"#[^\n]*(\n|$)"
But this matches a # anywhere instead of just at the start of a line, so let's try this:
"(^|\n)#[^\n]*(\n|$)"
^ is the start of the string, so (^|\n) (start of string or newline) matches the position just before the start of a line.
Can you see a problem there? What if you have two comments on two consecutive lines? You'll only match the first, because the newline between them is consumed while matching the first comment, and the next match starts where the previous one finished.
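The effect can be reproduced with a short sketch. It uses std::regex from the C++ standard library rather than QRegExp so that it runs without Qt, and it assumes that ^ anchors only at the very start of the input (the default for the ECMAScript grammar). allMatches is a hypothetical helper written for this illustration, not part of any library.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect every match of `pattern` in `text`,
// scanning left to right. Each search resumes where the previous
// match ended, which is the behaviour described above.
std::vector<std::string> allMatches(const std::string &text,
                                    const std::string &pattern)
{
    const std::regex re(pattern);
    std::vector<std::string> found;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it)
        found.push_back(it->str());
    return found;
}
```

Called as allMatches("#one\n#two\n", "(^|\n)#[^\n]*(\n|$)"), this should return only "#one\n": the next search starts at the # of the second line, where neither ^ nor \n can match.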
A work-around for this is using look-ahead:
"(^|\n)#[^\n]*(?=\n|$)"
The trailing newline is still checked but is no longer included in the match, so the match ends just before the newline and the next match can use it.
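As a minimal sketch of the look-ahead version, the function below collects the comments into a list. It again uses std::regex instead of QRegExp so that it runs without Qt. The name extractComments and the capture group around the comment text are assumptions added for this illustration; the group keeps the leading newline matched by (^|\n) out of the result.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: return every comment that starts at the
// beginning of a line in the given project file contents.
std::vector<std::string> extractComments(const std::string &projectContents)
{
    // (?:^|\n)   start of the string or a newline (not captured)
    // (#[^\n]*)  the comment itself, captured as group 1
    // (?=\n|$)   look-ahead: newline or end of string, not consumed
    static const std::regex commentExpr("(?:^|\n)(#[^\n]*)(?=\n|$)");

    std::vector<std::string> comments;
    for (std::sregex_iterator it(projectContents.begin(),
                                 projectContents.end(), commentExpr), end;
         it != end; ++it)
        comments.push_back((*it)[1].str());
    return comments;
}
```

With QRegExp the same idea would be to read cap(1) instead of cap(0) and to advance the search offset by matchedLength() after each match.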
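Can the # be preceded by spaces? If so, check for zero or more whitespace characters (\s*):
"(^|\n)\s*#[^\n]*(?=\n|$)"
Note that in a C++ string literal the backslash has to be doubled (\\s*), and that \s also matches newlines, so [ \t]* is a stricter alternative if only spaces and tabs should be allowed.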