I just asked a similar question to this one and got an excellent, accurate answer, but I now have a brand new problem: all of my relevant input is on a single line. I'm not sure how to ask this in an abstract way, so I'll just jump right into my input:
(EDITED to provide a better example)
bear999bear888bear777bear666fox---bear222bear333bear444bear555fox
(The items between the markers are not necessarily numeric)
This is the expression (EDITED to match updated input example):
bear.*bear(?<matchString>(.(?!bear.*bear))*?)bear.*fox
It's returning 444. Is there a way that I can tweak this to return both 444 and 777? It seems to be skipping over the first match and favoring only the latter. I have the negative lookahead (?!...) so that it matches only the innermost item on the left side.
I've been testing here: http://regexlib.com/RETester.aspx
This works great when I break it into two lines and turn on multi-line. Why does it stop working when the input is on a single line?
Any advice would be appreciated!
This should work (it does work in that regex tester you've linked in the question):
(?<=bear)(?:(?!bear).)*(?=bear(?:(?!bear).)*fox)
It reads like "let's match something that is preceded by bear, has no bear sequence within, and is followed by the bear - no bear - fox sequence".
The capturing groups are absent here; the whole match is what you need.
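If you want to check this outside the online tester, here is a minimal sketch in Python. Python's re module is my assumption here (the tester linked in the question is .NET-based), but the lookbehind is fixed-width, so the same pattern should carry over unchanged. The variable names are just for illustration.

```python
import re

# Sample input from the question
line = "bear999bear888bear777bear666fox---bear222bear333bear444bear555fox"

# (?<=bear)              the match must be preceded by "bear"
# (?:(?!bear).)*         any run of characters that contains no "bear"
# (?=bear(?:(?!bear).)*fox)
#                        followed by "bear", then a run with no "bear", then "fox"
pattern = re.compile(r"(?<=bear)(?:(?!bear).)*(?=bear(?:(?!bear).)*fox)")

# No capturing groups, so findall returns the whole matches
matches = pattern.findall(line)
print(matches)  # ['777', '444']
```

Because the lookarounds consume no characters, each match covers only the item itself, so the engine can go on to find the next item further along the same line.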
And yes, I just can't help wondering why this should be done with a single regex when it actually looks like a job for a tokenizer. :) For example, you can split your line by 'fox' first, then split each part by 'bear', and take the second-to-last item of each result.
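The split approach could look roughly like this in Python. This is only a sketch: the function name innermost_items and its parameters are made up for illustration, and it assumes the markers are plain strings, not patterns.

```python
def innermost_items(line, start="bear", end="fox"):
    """Return the item before the last one in each 'start...end' run.

    Hypothetical helper, not part of any library.
    """
    results = []
    # The final piece comes after the last end marker (or is the whole
    # line if there is no end marker), so it is never followed by 'fox'.
    for part in line.split(end)[:-1]:
        tokens = part.split(start)
        # tokens[0] is whatever precedes the first start marker, so at
        # least three tokens are needed for a real second-to-last item.
        if len(tokens) >= 3:
            results.append(tokens[-2])
    return results


line = "bear999bear888bear777bear666fox---bear222bear333bear444bear555fox"
print(innermost_items(line))  # ['777', '444']
```

It is a few more lines than the regex, but each step is easy to inspect, and it does not matter whether the input is on one line or several.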