I need help extracting multiple passport numbers that follow a passport keyword using a regex.
Text:
my friends passport numbers are V123456, V123457 and V123458
Regex:
(?<=passport)\s*(?:\w+\s){0,10}\s*(\b[a-zA-Z]{0,2}\d{6,12}[a-zA-Z]{0,2}\b)
Expected matches output:
V123456
V123457
V123458
Actual output:
V123456
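You can't rely on a lookbehind here. A lookbehind only checks the text immediately before the match start, so to reach the second and third numbers it would have to span everything between the keyword and each number, that is, a pattern of indefinite length. That is supported, but only in recent Java versions.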
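To see why the lookbehind attempt stops after the first number, here is a minimal sketch that runs the regex from the question against the sample text. The class name LookbehindDemo and the method findAll are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {
    // The regex from the question: the lookbehind requires "passport"
    // to sit directly before the position where the match starts.
    static final Pattern LOOKBEHIND = Pattern.compile(
        "(?<=passport)\\s*(?:\\w+\\s){0,10}\\s*(\\b[a-zA-Z]{0,2}\\d{6,12}[a-zA-Z]{0,2}\\b)");

    // Hypothetical helper: collects group 1 of every match found in the text.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = LOOKBEHIND.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "my friends passport numbers are V123456, V123457 and V123458";
        System.out.println(findAll(s)); // prints [V123456]
    }
}
```

Only one position in the sample text is directly preceded by passport, so only one match can start there. Every later search attempt begins after V123456, where the lookbehind can no longer succeed.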
You may use a pattern based on the \G operator:
(?:\G(?!\A)|\bpassport\b).*?\b([a-zA-Z]{0,2}\d{6,12}[a-zA-Z]{0,2})\b
See the regex demo. Pattern details:
(?:\G(?!\A)|\bpassport\b) - either a whole word passport (\bpassport\b) or (|) the end of the previous successful match (\G(?!\A))
.*? - any zero or more chars, as few as possible (since the pattern is compiled with Pattern.DOTALL, the . can match any characters, including line break characters)
\b([a-zA-Z]{0,2}\d{6,12}[a-zA-Z]{0,2})\b - a whole word that starts with zero, one or two ASCII letters, then has six to 12 digits and ends with zero, one or two ASCII letters.
See the Java demo below:
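import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "my friends passport numbers are V123456, V123457 and V123458";
// \G(?!\A) resumes at the end of the previous match, but never at the start of the input
String rx = "(?:\\G(?!\\A)|\\bpassport\\b).*?\\b([a-zA-Z]{0,2}\\d{6,12}[a-zA-Z]{0,2})\\b";
Pattern pattern = Pattern.compile(rx, Pattern.DOTALL);
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1)); // group 1 holds the passport number
}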
Output:
V123456
V123457
V123458
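The same \G-based pattern can be wrapped in a small method that returns the numbers as a list instead of printing them. This is only a sketch under stated assumptions: the class name PassportNumbers and the method extract are hypothetical, and the keyword is assumed to be the literal whole word passport.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PassportNumbers {
    // Either the keyword itself or the end of the previous match (\G),
    // excluding the very start of the input (\A), then the nearest number.
    private static final Pattern AFTER_KEYWORD = Pattern.compile(
        "(?:\\G(?!\\A)|\\bpassport\\b).*?\\b([a-zA-Z]{0,2}\\d{6,12}[a-zA-Z]{0,2})\\b",
        Pattern.DOTALL);

    // Hypothetical helper: returns every number that appears after the keyword.
    static List<String> extract(String text) {
        List<String> numbers = new ArrayList<>();
        Matcher m = AFTER_KEYWORD.matcher(text);
        while (m.find()) {
            numbers.add(m.group(1));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extract("my friends passport numbers are V123456, V123457 and V123458"));
        System.out.println(extract("ID A111111, passport B222222"));
    }
}
```

In this sketch, a number that appears before the keyword (A111111 in the second call) is skipped, because the first match can only begin at the word passport. Text without the keyword yields an empty list.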