I'm pretty new to the regex world.
Given a list of strings as input, I would like to split them using a regex punctuation pattern: "[!.?\n]".
The thing is, when several punctuation marks appear together, I would like to split only on the last one, like this:
input: "I want it now!!!"
output: "I want it now!!"
input: "Am I ok? Yeah, I'm fine!!!"
output: ["Am I ok", "Yeah, I'm fine!!"]
You can use
[!.?\n](?![!.?\n])
Here, a !, ., ? or newline is matched only if it is not followed by any of these chars.
Or, if only runs of the same char should be kept together:
([!.?\n])(?!\1)
Here, a !, ., ? or newline is matched only if it is not followed by exactly the same char.
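The two patterns only differ on mixed punctuation such as "?!". Here is a minimal sketch of that difference; the class name, method names and the sample input are made up for illustration and are not part of the question:

```java
import java.util.Arrays;

class LookaheadComparison {
    // A delimiter that is not followed by ANY of the delimiter chars
    static final String ANY_DELIMITER = "[!.?\n](?![!.?\n])";
    // A delimiter that is not followed by THE SAME char (\\1 refers back to group 1)
    static final String SAME_DELIMITER = "([!.?\n])(?!\\1)";

    static String[] splitOnAny(String s) {
        return s.split(ANY_DELIMITER);
    }

    static String[] splitOnSame(String s) {
        return s.split(SAME_DELIMITER);
    }

    public static void main(String[] args) {
        String s = "What?!"; // hypothetical input with mixed punctuation
        // "?" is followed by "!", so only the final "!" is a split point
        System.out.println(Arrays.toString(splitOnAny(s)));  // => [What?]
        // "?" is not followed by another "?", so both "?" and "!" are split points
        System.out.println(Arrays.toString(splitOnSame(s))); // => [What]
    }
}
```

With the second pattern, the empty strings produced between and after the two delimiters are trailing, so String.split drops them.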
See the regex demo #1 and the regex demo #2.
See a Java demo:
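import java.util.Arrays;

// p: a delimiter not followed by any other delimiter
String p = "[!.?\n](?![!.?\n])";
// p2: a delimiter not followed by the same char (\\1 is a backreference to group 1)
String p2 = "([!.?\n])(?!\\1)";
String s = "I want it now!!!";
System.out.println(Arrays.toString(s.split(p))); // => [I want it now!!]
System.out.println(Arrays.toString(s.split(p2))); // => [I want it now!!]
s = "Am I ok? Yeah, I'm fine!!!";
// The space after "?" is not part of the match, so it stays at the start of the second item
System.out.println(Arrays.toString(s.split(p))); // => [Am I ok,  Yeah, I'm fine!!]
System.out.println(Arrays.toString(s.split(p2))); // => [Am I ok,  Yeah, I'm fine!!]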
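Note that the space after "?" is not consumed by either pattern, so the second item actually starts with a space. If you want the items to match the expected output in the question exactly, one possible tweak (my own assumption, not something the question asks for) is to let the pattern also consume any whitespace after the delimiter. A rough sketch, with a made-up class and method name:

```java
import java.util.Arrays;

class SplitAndTrim {
    // Same as the first pattern, plus \\s* to swallow whitespace after the last delimiter.
    // The \\s* part is an assumed addition for illustration.
    static final String DELIMITER_THEN_SPACE = "[!.?\n](?![!.?\n])\\s*";

    static String[] splitSentences(String s) {
        return s.split(DELIMITER_THEN_SPACE);
    }

    public static void main(String[] args) {
        String s = "Am I ok? Yeah, I'm fine!!!";
        // The space after "?" is now part of the match, so it is removed
        System.out.println(Arrays.toString(splitSentences(s))); // => [Am I ok, Yeah, I'm fine!!]
    }
}
```

Calling trim() on each item after splitting should work as well if you prefer to keep the pattern unchanged.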