I have to use a regular expression to match several strings and capture parts of the string.
Example strings could look like:
The goal is to lazily match and capture the middle name(s) of robert palmer, stopping at the point where the surname (palmer) appears in the string, AND to ensure the rest of the string matches the static text (robert ___ palmer sent for the boat).
I have used a positive lookahead to find the middle name and stop matching if palmer is found:
/robert (.+?)(?=\spalmer) palmer/
which correctly matches:
robert eric palmer
robert eric william palmer
and correctly doesn't match:
robert eric william palmer palmer
The problem:
When I add the rest of the static text to the regex:
/robert (.+?)(?=\spalmer) palmer sent for the boat/
it incorrectly matches:
robert eric william palmer palmer sent for the boat
robert eric palmer palmer sent for the boat
How can I lazily match up to palmer for the middle name and still assert that the rest of the static text matches?
I hope this makes sense!
You may use
robert ((?:(?!palmer).)+?) palmer sent for the boat
See the regex demo.
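To see the difference locally, here is a minimal sketch using Python's re module. Python is an assumption on my part (the question does not name a regex flavor), and the names ORIGINAL, TEMPERED and middle_name are made up for illustration. It shows that with the lookahead version the lazy (.+?) is free to expand past the first palmer when the rest of the pattern fails, while the tempered version cannot:

```python
import re

# Pattern from the question: the lookahead only checks the position where
# the lazy group happens to stop, so backtracking can push it past "palmer".
ORIGINAL = re.compile(r"robert (.+?)(?=\spalmer) palmer sent for the boat")

# Pattern from this answer: every consumed character is checked, so the
# group can never contain the start of "palmer".
TEMPERED = re.compile(r"robert ((?:(?!palmer).)+?) palmer sent for the boat")


def middle_name(pattern, text):
    """Return capture group 1, or None if the pattern does not match."""
    m = pattern.search(text)
    return m.group(1) if m else None


good = "robert eric william palmer sent for the boat"
bad = "robert eric palmer palmer sent for the boat"

print(middle_name(ORIGINAL, good))  # eric william
print(middle_name(ORIGINAL, bad))   # eric palmer  (the unwanted match)
print(middle_name(TEMPERED, good))  # eric william
print(middle_name(TEMPERED, bad))   # None
```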
Details
robert - a literal substring
((?:(?!palmer).)+?) - a capturing group #1 with a tempered greedy token that matches any char (.), 1 or more occurrences but as few as possible, that does not start a palmer char sequence
palmer sent for the boat - a literal substring.
To unroll the pattern for better performance use
robert ([^p]*(?:p(?!almer)[^p]*)*) palmer sent for the boat
See this regex demo.
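A small sketch of the unrolled pattern, again in Python's re module (an assumption, as the question does not name a flavor; UNROLLED and middle_name_unrolled are illustrative names). It is not a benchmark, only a check that the unrolled form gives the same captures on the sample strings:

```python
import re

# Unrolled form: runs of non-"p" characters, interleaved with any "p"
# that is not the start of "palmer".
UNROLLED = re.compile(
    r"robert ([^p]*(?:p(?!almer)[^p]*)*) palmer sent for the boat"
)


def middle_name_unrolled(text):
    """Return capture group 1, or None if the pattern does not match."""
    m = UNROLLED.search(text)
    return m.group(1) if m else None


print(middle_name_unrolled("robert eric palmer sent for the boat"))         # eric
print(middle_name_unrolled("robert philip palmer sent for the boat"))       # philip
print(middle_name_unrolled("robert eric palmer palmer sent for the boat"))  # None
```

Two small differences from the tempered version are worth keeping in mind: the unrolled group uses * quantifiers, so it can capture an empty string, and [^p] also matches line breaks, which . does not do by default.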