I have a regex with a mandatory part, a non-greedy (lazy?) part, an optional part and finally another non-greedy part.
<mandatory><non-greedy><optional><non-greedy>
Implemented as:
^mandatory.*?(?:optionalpart)?.*?$
The optionalpart consists of 'a piece to find' and 'a piece to return in a capture group'.
^mandatory.*?(?:findme(matchme))?.*?$
But for some inputs the first non-greedy part consumes characters that the following optional part should match. Is there a way to make the optional part more greedy than the previous non-greedy part?
Example: Find the character after the "2,", or find an empty string if there is no "2," but the mandatory part matches.
"Foo: 2,b,1,a,3,c" -> match, $1 = "b"
"Foo: 1,a,2,b,3,c" -> match, $1 = "b"
"Foo: 1,a,3,c,2,b" -> match, $1 = "b"
"Foo: 2,b" -> match, $1 = "b"
"Foo: 1,a,3,c" -> match, $1 = ""
"Fuu: 1,a,2,b,3,c" -> no match.
Attempt 1: ^Foo: .*?(?:2,([a-z]))?.*?$
This fails on the 2nd and 3rd example, returning "" instead of "b".
Attempt 2: ^Foo: .*?(?:2,([a-z])).*?$
This fixes the previous failures, but now fails on the 5th example, which no longer matches.
The part that must be optional is no longer optional.
If it matters, I'm using Java's Pattern class.
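For reference, here is a minimal reproduction of both attempts. It is only a sketch: it assumes whole-input matching with Matcher.matches(), and the class name AttemptRepro and the helper run are illustrative names, not part of any real code base. Note that Java reports a group that did not participate in the match as null rather than "".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to reproduce the two attempts.
public class AttemptRepro {
    static final Pattern ATTEMPT_1 = Pattern.compile("^Foo: .*?(?:2,([a-z]))?.*?$");
    static final Pattern ATTEMPT_2 = Pattern.compile("^Foo: .*?(?:2,([a-z])).*?$");

    // Returns "no match" if the pattern does not match the whole input,
    // otherwise the text of group 1 ("null" if the group was skipped).
    static String run(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.matches() ? String.valueOf(m.group(1)) : "no match";
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Foo: 2,b,1,a,3,c", "Foo: 1,a,2,b,3,c", "Foo: 1,a,3,c,2,b",
            "Foo: 2,b", "Foo: 1,a,3,c", "Fuu: 1,a,2,b,3,c"
        };
        for (String s : inputs) {
            System.out.println(s + " | attempt 1: " + run(ATTEMPT_1, s)
                    + " | attempt 2: " + run(ATTEMPT_2, s));
        }
    }
}
```

In attempt 1 the first lazy part matches zero characters, the optional group is tried only at that position, fails, and is skipped; the trailing lazy part then consumes the rest of the input.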
--
This was asked before, but there was no satisfactory answer for either of us.
Your first regex is very close; you need to move (?: a bit more to the left to include the .*? pattern:
^Foo:(?: .*?2,([a-z]))?.*$
     ^^^
See the regex demo
Details
^ - start of string
Foo: - some literal text
(?: .*?2,([a-z]))? - an optional non-capturing group that matches greedily (will be tried at least once) 1 or 0 occurrences of:
    .*? (preceded by a space) - a space followed by any 0+ chars other than line break chars, as few as possible
    2, - a literal substring
    ([a-z]) - Group 1: a lowercase letter
.* - any 0+ chars other than line break chars (the rest of the string)
$ - end of string.
The general pattern will look like
^<MANDATORY_LITERAL>(?:<NON_GREEDY_DOT>(<OPTIONAL_PART>))?<GREEDY_DOT>$
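
Since the question mentions Java's Pattern class, here is a minimal sketch of how the regex above could be checked against the six sample inputs. The class name OptionalGroupDemo and the helper extract are hypothetical, and mapping an unmatched group to "" is an assumption made to mirror the expected results in the question (Matcher.group(1) itself returns null when the optional group is skipped).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
public class OptionalGroupDemo {
    // The lazy part sits inside the optional group, so the engine expands
    // it up to the first "2," followed by a letter before giving up on
    // the group and letting the greedy .* take the rest.
    static final Pattern PATTERN = Pattern.compile("^Foo:(?: .*?2,([a-z]))?.*$");

    // Returns null when the input does not match at all, "" when it matches
    // without the optional part, and the captured letter otherwise.
    static String extract(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            return null;
        }
        String captured = m.group(1); // null if the optional group was skipped
        return captured == null ? "" : captured;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Foo: 2,b,1,a,3,c", "Foo: 1,a,2,b,3,c", "Foo: 1,a,3,c,2,b",
            "Foo: 2,b", "Foo: 1,a,3,c", "Fuu: 1,a,2,b,3,c"
        };
        for (String s : inputs) {
            String result = extract(s);
            System.out.println(s + " -> "
                    + (result == null ? "no match" : "match, $1 = \"" + result + "\""));
        }
    }
}
```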