My regular expression looks like this: "[a-zA-Z]+[ \t]*(?:,[ \t]*(\\d+)[ \t]*)*"
I can match the lines with this, but I don't know how to capture the numbers. I think it has something to do with grouping.
For example, from the string "asd , 5 ,2,6 ,8", how do I capture the numbers 5, 2, 6 and 8?
A few more examples:
sdfs6df -> no capture
fdg4dfg, 5 -> capture 5
fhhh3 , 6,8 , 7 -> capture 6 8 and 7
asdasd1,4,2,7 -> capture 4 2 and 7
I need these numbers so I can continue my work with them. Thanks in advance.
You could match the leading word characters and make use of the \G anchor to capture the consecutive digits after each comma.
Pattern
(?:\w+|\G(?!^))\h*,\h*([0-9]+)
Explanation
(?:  Non-capturing group
\w+  Match 1+ word chars
|  or
\G(?!^)  Assert the position at the end of the previous match, not at the start
)  Close the non-capturing group
\h*,\h*  Match a comma between optional horizontal whitespace chars
([0-9]+)  Capture group 1, match 1+ digits
In Java with double escaped backslashes:
String regex = "(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)";
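To see what the (?!^) after \G is for, here is a small sketch that runs the pattern with and without the lookahead on an input that starts with a comma. On a fresh Matcher, \G also matches at the very start of the input, so without the lookahead a number with no word in front of it would be captured. The class name GAnchorDemo and the helper captures are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GAnchorDemo {
    // Hypothetical helper: collect group 1 of every match of regex in input.
    public static List<String> captures(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String withGuard = "(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)";
        String withoutGuard = "(?:\\w+|\\G)\\h*,\\h*([0-9]+)";
        String input = ", 5"; // no leading word before the comma

        // \G(?!^) cannot match at position 0, so nothing is captured.
        System.out.println(captures(withGuard, input));    // prints []
        // A bare \G matches at the start of the input, so 5 is captured.
        System.out.println(captures(withoutGuard, input)); // prints [5]
    }
}
```

So the lookahead makes sure that the \G branch can only continue a match that started with \w+.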
Example code
// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String regex = "(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)";
String string = "sdfs6df -> no capture\n\n"
        + "fdg4dfg, 5 -> capture 5\n\n"
        + "fhhh3 , 6,8 , 7 -> capture 6 8 and 7\n\n"
        + "asdasd1,4,2,7 -> capture 4 2 and 7";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
// Each find() yields one number in group 1
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output
5
6
8
7
4
2
7
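If you want to keep working with the numbers instead of printing them, one possible sketch is to run the same pattern on a single line and collect group 1 into a list. The names NumberCapture and extractNumbers are hypothetical, and Integer.parseInt assumes every number fits in an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberCapture {
    // Same pattern as above, compiled once and reused.
    private static final Pattern NUMBERS =
            Pattern.compile("(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)");

    // Hypothetical helper: return the comma-separated numbers of one line.
    public static List<Integer> extractNumbers(String line) {
        List<Integer> numbers = new ArrayList<>();
        Matcher matcher = NUMBERS.matcher(line);
        while (matcher.find()) {
            // Assumption: the digits fit in an int.
            numbers.add(Integer.parseInt(matcher.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        String[] lines = {"sdfs6df", "fdg4dfg, 5", "fhhh3 , 6,8 , 7", "asdasd1,4,2,7"};
        for (String line : lines) {
            System.out.println(line + " -> " + extractNumbers(line));
        }
    }
}
```

A line without any comma-separated number, such as sdfs6df, simply gives an empty list.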