Suppose I want to return all occurrences of 'lep' in a string in Python, but not if an occurrence is in a substring like 'filepath' or 'telephone'. Right now I am using a combination of negative lookahead/lookbehind:
(?<!te|fi)lep(?!hone|ath)
However, I do want 'telepath' and 'filephone' as well as 'filep' and 'telep'. I've seen similar questions but not one that addresses this type of combination of lookahead/behind.
Thanks!
You can place lookaheads inside lookbehinds (and vice versa; any combination, really, as long as every lookbehind has a fixed length). That lets you fold each prefix/suffix pair into a single condition (the match is not both preceded by X and followed by Y):
lep(?<!telep(?=hone))(?<!filep(?=ath))
Putting the lookbehinds last is more efficient, too. I would advise doing it that way even if there's no suffix (for example, lep(?<!filep) to exclude filep).
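As a quick check, here is a minimal sketch that runs the combined pattern against the words from the question. The sample text and variable names are made up for illustration.

```python
import re

# Pattern from the answer: match 'lep' unless it is both preceded by
# 'te'/'fi' and followed by 'hone'/'ath' in the excluded combinations.
pattern = re.compile(r"lep(?<!telep(?=hone))(?<!filep(?=ath))")

# Illustrative sample text (not from the original post).
text = "telephone filepath telepath filephone filep telep lep"

# Keep every word in which the pattern finds a match.
hits = [word for word in text.split() if pattern.search(word)]
print(hits)  # ['telepath', 'filephone', 'filep', 'telep', 'lep']
```

Only 'telephone' and 'filepath' are rejected; the mixed forms 'telepath' and 'filephone' still match.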
However, generating the regexes from user input like lep -telephone -filepath promises to be finicky and tedious. If you can, it would be much easier to search for the unwanted terms first and eliminate them. For example, search for:
(?:telephone|filepath|(lep))
If the search succeeds and group(1) is not None, it's a hit.
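A rough sketch of that approach, assuming the user input has the form shown above (a term followed by words prefixed with '-'). The helper name find_term and the query parsing are hypothetical, not part of any library.

```python
import re

def find_term(term, excluded, text):
    """Return match objects for `term` that are not part of an excluded word."""
    # Excluded words come first and are not captured; the term goes last,
    # in group 1, so an excluded word starting at the same position wins.
    alternatives = [re.escape(word) for word in excluded]
    alternatives.append("(" + re.escape(term) + ")")
    pattern = re.compile("(?:" + "|".join(alternatives) + ")")
    # A match is a real hit only if group 1 took part in it.
    return [m for m in pattern.finditer(text) if m.group(1) is not None]

# Hypothetical parsing of a query such as "lep -telephone -filepath".
query = "lep -telephone -filepath"
term, *rest = query.split()
excluded = [word[1:] for word in rest if word.startswith("-")]

text = "telephone filepath telepath filephone filep telep lep"
starts = [m.start(1) for m in find_term(term, excluded, text)]
print(starts)
```

Because the excluded words are consumed whole before the bare term gets a chance to match inside them, no lookaround is needed, and re.escape keeps any regex metacharacters in the user input literal.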