How do I not capture or detect a match if it is preceded by the regex pattern r"(?<=\s|^)dont\s*"?
This is the pattern that you want to use to exclude matches. It uses a lookbehind "(?<=\s|^)dont" to check for a space or the start of the string before the word "dont". This ensures that the word "dont" is not preceded by any characters other than spaces or the start of the string.
Basically, what I am looking to achieve is this: if there is a "dont" before the original pattern, and that "dont" is preceded by a space "\s" or by the beginning of the string "^", then the match should not be detected and therefore the capture group should not be captured either.
import re
# example 1: should capture, because "Idont" does not match this part of the pattern: (?<=\s|^)dont
#input_text = "I think Idont like a lot red apples"
# example 2: should not capture
input_text = "I think I dont like a lot red apples"
interests_match = re.search(r"(?:like\s*a\s*lot\s+)(.+)", input_text, flags = re.IGNORECASE)
if interests_match: print(interests_match.group(1))
The correct output for each example:
"red apples" #example 1
None #example 2
This should do what you want.
r"(?:(?:^|\s)dont.*)|(?:like\s*a\s*lot\s+)(.+)"
The pattern on the left side of the second | will skip the rest of the line if it has ^dont or \sdont in it, so that the (.+) will not capture anything.
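As a quick check, here is a minimal sketch that runs the pattern against both example inputs from the question. The names PATTERN, examples and results are only illustrative and are not part of the original code.

```python
import re

# Pattern from the answer: the left branch consumes everything from a
# standalone "dont" onward; the right branch captures what follows "like a lot".
PATTERN = re.compile(r"(?:(?:^|\s)dont.*)|(?:like\s*a\s*lot\s+)(.+)", re.IGNORECASE)

examples = [
    "I think Idont like a lot red apples",   # "dont" glued to "I": capture expected
    "I think I dont like a lot red apples",  # standalone "dont": no capture expected
]

results = []
for text in examples:
    m = PATTERN.search(text)
    # group(1) is None when the "dont" branch produced the match
    results.append(m.group(1) if m else None)

print(results)  # ['red apples', None]
```

The pattern uses a plain (?:^|\s) group rather than the lookbehind from the question. As far as I know, Python's re module rejects a lookbehind such as (?<=\s|^) because its alternatives have different widths.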
Note: When the dont branch is the one that matches, group 1 is None, so check that group 1 exists before using it as a string, otherwise you will get an error.
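One way to wrap that check, sketched under the assumption that you want None both when nothing matches and when a "dont" suppresses the match. The helper name extract_interest is hypothetical.

```python
import re
from typing import Optional

_PATTERN = re.compile(r"(?:(?:^|\s)dont.*)|(?:like\s*a\s*lot\s+)(.+)", re.IGNORECASE)

def extract_interest(text: str) -> Optional[str]:
    """Return the text after "like a lot", or None if it is absent or negated."""
    m = _PATTERN.search(text)
    # No match at all, or the "dont" branch matched and left group 1 unset
    if m is None or m.group(1) is None:
        return None
    return m.group(1).strip()

print(extract_interest("I think Idont like a lot red apples"))   # red apples
print(extract_interest("I think I dont like a lot red apples"))  # None
```

Keep in mind that this only suppresses the capture when the "dont" comes before "like a lot" in the string, since the search returns the leftmost match.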