The code snippet for positive lookbehind is below:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PositiveLookBehind {
    public static void main(String[] args) {
        // Note: the lookbehind (?<=9) is placed after [a-z]
        String regex = "[a-z](?<=9)";
        String input = "a9es m9x us9s w9es";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);
        System.out.println("===starting====");
        // Print every match together with its start and end index
        while (matcher.find()) {
            System.out.println("found:" + matcher.group()
                    + " start index:" + matcher.start()
                    + " end index is " + matcher.end());
        }
        System.out.println("===ending=====");
    }
}
I expected 4 matches, but to my surprise the output shows no match at all.
Can anyone point out my mistake?
As far as I understand, the regex here means "a lowercase letter preceded by the digit 9", which is satisfied at 4 locations in the input.
Notice that (?<=9) is placed after [a-z]. What does that mean?
Let's consider data like "a9c".
At the start, the regex engine places its "cursor" at the beginning of the string it iterates over, here:
|a9c
^-regex cursor is here
Then the regex engine tries to match each part of the pattern from left to right. So for [a-z](?<=9) it first tries to find a match for [a-z], and only after that succeeds does it move on to evaluating the (?<=9) part.
So the match for [a-z] happens here:
a9c
*<-- match for `[a-z]`
After that match, the regex engine moves the cursor here:
a|9c
*^--- regex-engine cursor
^---- match for [a-z]
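To see the cursor position described above, here is a minimal sketch that matches [a-z] on its own against "a9c" and reports where the match ends. The class name CursorPosition and the helper endOfFirstMatch are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CursorPosition {
    // Hypothetical helper: end index of the first match of regex in input, or -1 if none.
    static int endOfFirstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.end() : -1;
    }

    public static void main(String[] args) {
        // [a-z] consumes "a" at index 0, so the match ends at index 1: a|9c
        System.out.println(endOfFirstMatch("[a-z]", "a9c")); // prints 1
    }
}
```

The end index is the position from which anything written after [a-z] in the pattern, including a look-behind, gets evaluated.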
So now (?<=9) is evaluated (notice the position of the cursor |). (?<=subregex) checks whether the text immediately before the cursor can be matched by subregex. But since the cursor here is directly after a, the look-behind (?<=9) "sees" that a as the data its subexpression should test. And since a can't be matched by 9, the evaluation fails.
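The behaviour described above can be checked with a small sketch. It assumes a hypothetical helper countMatches (not part of the original code) and compares (?<=9) with (?<=a), both placed after [a-z].

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindAfterLetter {
    // Hypothetical helper: counts how many times regex matches in input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The look-behind tests the letter that [a-z] just consumed, which is never '9'.
        System.out.println(countMatches("[a-z](?<=9)", "a9c")); // prints 0
        // With (?<=a) the look-behind accepts the consumed letter when it is 'a'.
        System.out.println(countMatches("[a-z](?<=a)", "a9c")); // prints 1
    }
}
```

Because the character directly before the cursor is always the letter that [a-z] just matched, [a-z](?<=9) cannot match anywhere, whatever the input.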
You probably wanted to check whether 9 is placed before an acceptable letter. To achieve that, you can modify your regex in several ways:
with [a-z](?<=9.) you make the look-behind test the two previous characters:
a9c|
 ^^
 9. - `9` matches 9, `.` matches any character (the one directly before the cursor)
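or, simpler, with (?<=9)[a-z] you first check for 9 behind the cursor and then look for [a-z], which lets the regex match c when the cursor is at 9|c. The 9 itself is not part of the match, because a look-behind consumes no characters.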
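As a rough check of both suggestions, the sketch below runs them against the input from the question. The class name LookbehindFixes and the helper findAll are hypothetical, and the "letter@index" format is just one way to display each match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindFixes {
    // Hypothetical helper: collects every match as "text@startIndex".
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group() + "@" + m.start());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "a9es m9x us9s w9es";
        // Look-behind first: the cursor must sit right after a '9', then a letter is consumed.
        System.out.println(findAll("(?<=9)[a-z]", input));  // prints [e@2, x@7, s@12, e@16]
        // Letter first: the look-behind then tests the two characters before the cursor.
        System.out.println(findAll("[a-z](?<=9.)", input)); // prints [e@2, x@7, s@12, e@16]
    }
}
```

Both variants find the four letters that follow a 9, and in both cases the 9 is left out of the matched text.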