With String.split it is easy to split a string by multiple separators. You just need to define a regular expression that matches all the separators you want to use. For example, "1.22-3".split("[.-]") results in an array with the elements "1", "22", and "3". So far so good.
Now however I also need to know which one of the separators was found between the segments. Is there a straightforward way to achieve this?
I looked at String.split, its legacy predecessor StringTokenizer, and other supposedly more modern libraries (e.g. StrTokenizer from Apache Commons), but with none of them can I get hold of the matched separator.
It’s quite simple if you retrace what String.split(regex) does and record the information that String.split ignores:
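import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String source = "1.22-3";
Matcher m = Pattern.compile("[.-]").matcher(source);
List<String> elements = new ArrayList<>();
List<String> separators = new ArrayList<>();
int pos;
for (pos = 0; m.find(); pos = m.end()) {
    // text between the previous separator (or the start) and this one
    elements.add(source.substring(pos, m.start()));
    // the matched separator, which String.split would discard
    separators.add(m.group());
}
// remainder after the last separator
elements.add(source.substring(pos));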
At the end of this code, separators.get(x) yields the separator between elements.get(x) and elements.get(x+1). It should be clear that separators has one item fewer than elements.
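To make the relationship between the two lists concrete, here is a minimal, self-contained sketch that wraps the loop in a method and then rebuilds the original string from both lists. The class name SeparatorSplit and the methods split and rejoin are made up for this illustration, and the sketch assumes the regex never matches the empty string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeparatorSplit {
    // Hypothetical helper: fills both lists, same loop as in the answer.
    static void split(String source, String regex,
                      List<String> elements, List<String> separators) {
        Matcher m = Pattern.compile(regex).matcher(source);
        int pos;
        for (pos = 0; m.find(); pos = m.end()) {
            elements.add(source.substring(pos, m.start()));
            separators.add(m.group());
        }
        elements.add(source.substring(pos));
    }

    // Hypothetical helper: interleaves elements and separators again.
    // Works because separators.get(i) sits between elements i and i + 1.
    static String rejoin(List<String> elements, List<String> separators) {
        StringBuilder sb = new StringBuilder(elements.get(0));
        for (int i = 0; i < separators.size(); i++) {
            sb.append(separators.get(i)).append(elements.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> elements = new ArrayList<>();
        List<String> separators = new ArrayList<>();
        split("1.22-3", "[.-]", elements, separators);
        System.out.println(elements);                     // [1, 22, 3]
        System.out.println(separators);                   // [., -]
        System.out.println(rejoin(elements, separators)); // 1.22-3
    }
}
```

Note that, unlike String.split, this loop keeps trailing empty strings, which is what makes the round trip back to the source string possible.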
If you want to have elements and separators in one list, just change the code to let these two lists be the same list. The items are already added in order of occurrence.
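The single-list variant could look roughly like this. It is only a sketch: the class name TokenSplit and the method splitKeepingSeparators are invented for the example, and it again assumes a regex that does not match the empty string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenSplit {
    // Hypothetical helper: elements and separators go into one list,
    // in order of occurrence.
    static List<String> splitKeepingSeparators(String source, String regex) {
        Matcher m = Pattern.compile(regex).matcher(source);
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (m.find()) {
            tokens.add(source.substring(pos, m.start())); // element
            tokens.add(m.group());                        // separator
            pos = m.end();
        }
        tokens.add(source.substring(pos));                // last element
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitKeepingSeparators("1.22-3", "[.-]"));
        // [1, ., 22, -, 3]
    }
}
```

In the resulting list, elements sit at even indices and separators at odd indices, so the two kinds of token can still be told apart by position.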