I want a regex pattern that matches all addresses ending in 4 or more digits, unless those digits come after 'APT', 'BOX', 'APT ', or 'BOX '.
So it should match the following cases:
HITME 1234
HITME 12345
HITME1234
but not the following cases:
BOX 1234
BOX 12345
BOX4044
APT 1234
APT 12345
NONHIT123
NONHIT 123
I came up with this one:
(?<!(APT |BOX ))([0-9]{4,})$
but it does not work correctly. It still matches some of the cases that should be excluded.
TL;DR: use ^(?!APT|BOX).*?([0-9]{4,})$

Your regex (?<!(APT |BOX ))([0-9]{4,})$ incorrectly matches:
BOX 12345 on 2345, because 2345 is not preceded by 'APT ' or 'BOX '. Instead, it is preceded by 'BOX 1'.
BOX4044 on 4044, because 4044 is not preceded by 'APT ' or 'BOX ' (with the trailing space). Instead, it is preceded by 'BOX'.
APT 12345 on 2345, for a similar reason.

The regex you're looking for is ^(?!APT|BOX).*?([0-9]{4,})$, which breaks down as follows:
^(?!APT|BOX) - the beginning of the string cannot be followed by APT or BOX
.*? - a bunch of garbage in the middle of the string, taking as few characters as possible (i.e. HITME in your test cases)
([0-9]{4,})$ - the matched digits at the end of the string
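
As a quick sanity check, here is a minimal sketch that runs both patterns against the test cases from the question. The question does not say which regex flavor is in use, so Python's re module is an assumption here, and the helper name trailing_digits is made up for illustration.

```python
import re

# Pattern from the question: negative lookbehind placed directly before the digits.
LOOKBEHIND = re.compile(r"(?<!(APT |BOX ))([0-9]{4,})$")
# Pattern from the answer: negative lookahead anchored at the start of the string.
LOOKAHEAD = re.compile(r"^(?!APT|BOX).*?([0-9]{4,})$")

SHOULD_MATCH = ["HITME 1234", "HITME 12345", "HITME1234"]
SHOULD_NOT_MATCH = [
    "BOX 1234", "BOX 12345", "BOX4044",
    "APT 1234", "APT 12345",
    "NONHIT123", "NONHIT 123",
]


def trailing_digits(pattern, address):
    """Return the trailing digits captured by pattern, or None if there is no match.

    (Hypothetical helper.) In both patterns the digits are the last capturing group.
    """
    m = pattern.search(address)
    return m.groups()[-1] if m else None


if __name__ == "__main__":
    for address in SHOULD_MATCH + SHOULD_NOT_MATCH:
        print(
            f"{address!r:15} "
            f"lookbehind={trailing_digits(LOOKBEHIND, address)!r:9} "
            f"lookahead={trailing_digits(LOOKAHEAD, address)!r}"
        )
```

With the lookbehind pattern, BOX 12345, BOX4044 and APT 12345 still produce a match on their last four digits, while the lookahead pattern rejects every address that starts with APT or BOX. Note that this also rejects an address such as BOXER 1234, since the lookahead only checks the first three characters.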