I am trying to craft a delimiter regex (for use with java.util.Scanner) that splits a string on whitespace while also keeping colons, opening parentheses and closing parentheses as separate tokens. That is, foo(a:b) should be split into the tokens foo, (, a, :, b and ).
My current best effort is the pattern "\\s+|(?=[(:])|(?<=[:)])", which for some reason I can't understand fails to match after the opening parenthesis and before the closing parenthesis, but matches fine on both sides of the colon.
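Your pattern has no alternative that can match after ( or before ): the lookahead only accepts ( or : to the right, and the lookbehind only accepts : or ) to the left. If you want all of those separate parts, you could extend the character classes so that the lookahead asserts one of the characters [(:)] to the right and, if this is the whole string, the lookbehind asserts one of the characters [(:] to the left.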
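If you also want to match the position after the last closing parenthesis, both character classes can be the same, [(:)], which gives \s+|(?=[(:)])|(?<=[(:)])
The pattern for the first variant, where the lookbehind omits the closing parenthesis:
\s+|(?=[(:)])|(?<=[(:])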
Example code
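import java.util.Scanner;

String s = "foo(a:b)";
// The zero-width lookarounds split around (, : and ) without consuming them
Scanner scanner = new Scanner(s).useDelimiter("\\s+|(?=[(:)])|(?<=[(:])");
while (scanner.hasNext()) {
    System.out.println(scanner.next());
}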
Output
foo
(
a
:
b
)
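To see how the three delimiters differ, here is a minimal sketch that runs each of them through a Scanner and collects the tokens. The class name DelimiterDemo, the tokenize helper, the constant names and the extra input foo(a:b)bar are made up for illustration and are not part of the question. By my reading of the patterns, the question's delimiter should leave (a and b) glued together, the first variant should leave )bar in one piece, and the variant with the same class on both sides should split every symbol off.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {
    // Delimiter from the question: nothing matches after '(' or before ')'.
    static final String ORIGINAL = "\\s+|(?=[(:])|(?<=[:)])";
    // First variant from the answer: the lookbehind class omits ')'.
    static final String EXTENDED = "\\s+|(?=[(:)])|(?<=[(:])";
    // Second variant: the same class [(:)] in lookahead and lookbehind.
    static final String SYMMETRIC = "\\s+|(?=[(:)])|(?<=[(:)])";

    // Hypothetical helper: collects every token the Scanner yields.
    static List<String> tokenize(String input, String delimiter) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input).useDelimiter(delimiter)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo(a:b)", ORIGINAL));
        System.out.println(tokenize("foo(a:b)", EXTENDED));
        // Text directly after ')' is where the two variants diverge.
        System.out.println(tokenize("foo(a:b)bar", EXTENDED));
        System.out.println(tokenize("foo(a:b)bar", SYMMETRIC));
    }
}
```

If text can follow a closing parenthesis without whitespace in between, the symmetric class is probably the safer choice. For input where ) is always followed by whitespace or the end of the string, both variants should produce the same tokens.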