c# .net regex capturing-group

Regex with optional part doesn't create backreference


I want to match an optional tag at the end of a line of text.

Example input text:

The quick brown fox jumps over the lazy dog {tag}

I want to match the part in curly-braces and create a back-reference to it.

My regex looks like this:

^.*(\{\w+\})?

(This is somewhat simplified; I'm also matching parts before the tag.)

It matches the lines ok (with and without the tag) but doesn't create a back-reference to the tag.

If I remove the '?' character, so that the regex is:

^.*(\{\w+\})

It creates a back-reference to the tag but then doesn't match lines without the tag.

I understood from http://www.regular-expressions.info/refadv.html that the optional operator wouldn't affect the backreference:

Round brackets group the regex between them. They capture the text matched by the regex inside them that can be reused in a backreference, and they allow you to apply regex operators to the entire grouped regex.

but I must have misunderstood something.

How do I make the tag part optional and create a back-reference when it exists?


Solution

  • It is not a backreference problem. The regular expression is already satisfied once .* has matched its text, so it has no need to go on and match the optional tag at the end. If you really are matching to the end of the line, the simplest solution is to append a $ (dollar sign), which forces the regular expression to match the whole line.

    edit

    BTW, I didn't take your regex literally, since you said it also matches other things, but just to be clear: a greedy .* will consume the whole line, tag included, so the optional group has nothing left to match even with the $ in place. You'd need something like [^{]* instead to prevent the tag from being swallowed. I'm guessing that's not a problem for you.
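To make the difference concrete, here is a minimal C# sketch that runs the patterns discussed above against the example line. The class and method names (TagDemo, TryGetTag) are made up for illustration, and the last pattern simply combines the two suggestions from this answer ([^{]* plus a trailing $); treat it as one possible fix rather than the definitive one.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the original question.
public static class TagDemo
{
    // Returns true and sets 'tag' when capture group 1 took part in the match.
    public static bool TryGetTag(string pattern, string line, out string tag)
    {
        Match m = Regex.Match(line, pattern);
        Group g = m.Groups[1];
        bool captured = m.Success && g.Success;
        tag = captured ? g.Value : "";
        return captured;
    }

    public static void Main()
    {
        string tagged = "The quick brown fox jumps over the lazy dog {tag}";
        string plain = "The quick brown fox jumps over the lazy dog";

        string[] patterns =
        {
            @"^.*(\{\w+\})?",     // original: greedy .* swallows the tag
            @"^.*(\{\w+\})?$",    // adding $ alone: .* still swallows the tag
            @"^[^{]*(\{\w+\})?$"  // [^{]* stops before the brace, so the group captures
        };

        foreach (string pattern in patterns)
        {
            foreach (string line in new[] { tagged, plain })
            {
                bool matched = Regex.IsMatch(line, pattern);
                bool captured = TryGetTag(pattern, line, out string tag);
                Console.WriteLine($"{pattern}  matched={matched}  captured={captured}  tag='{tag}'");
            }
        }
    }
}
```

With the first two patterns both lines match but group 1 is never successful; with the third, both lines still match and group 1 holds {tag} for the tagged line. Note that [^{]* assumes no other opening brace appears earlier in the line. If that assumption doesn't hold, a lazy .*? together with the trailing $ is another option worth trying.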