java string split delimiter splitter

Split string, put the elements into List with two possible delimiters


I would like to split a string into a List on two possible delimiters, "/" or "//". In addition, each delimiter should be put into the same List as the tokens. I cannot do this with Splitter in Guava or with java.util.Scanner.

Scanner s = new Scanner(str);
s.useDelimiter("//|/");
while (s.hasNext()) {
    System.out.println(s.delimiter());
    System.out.println(s.next());
}

s.delimiter() returns the whole pattern //|/, but I want to get the delimiter that was actually matched: / or //.

Do you know any other library which can do this?

I wrote some code that works, but it is not a very nice solution:

public static ArrayList<String> processString(String s) {
    ArrayList<String> stringList = new ArrayList<String>();
    String word = "";
    for (int i = 0; i < s.length(); i++) {
        // check i + 1 against the length so a trailing single '/' does not read past the end
        if (s.charAt(i) == '/' && i + 1 < s.length() && s.charAt(i + 1) == '/') {
            if (!word.equals(""))
                stringList.add(word);
            stringList.add("//");
            word = "";
            i++;
        } else if (s.charAt(i) == '/') {
            if (!word.equals(""))
                stringList.add(word);
            stringList.add("/");
            word = "";
        } else {
            word = word + s.charAt(i);
        }
    }
    // skip the last word if the string ended with a delimiter
    if (!word.equals(""))
        stringList.add(word);
    return stringList;
}

On "some/string//with/slash/or//two" it returns a List with some, /, string, //, with, /, slash, /, or, //, two

On "/some/string//with/slash/or//two" it returns a List with /, some, /, string, //, with, /, slash, /, or, //, two

On "//some/string//with/slash/or//two" it returns a List with //, some, /, string, //, with, /, slash, /, or, //, two


Solution

  • The useDelimiter method also has an overload that takes a Pattern object instead of a String.

    You should use that one instead:

    Scanner s = new Scanner(str);
    s.useDelimiter(Pattern.compile("/{1,2}"));
    while (s.hasNext()) {
        System.out.println(s.delimiter());
        System.out.println(s.next());
    }
    

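    Note that s.delimiter() returns the Pattern the Scanner is using, not the text that was matched. In order to capture the delimiter itself, you're going to need to change your approach.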

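    // requires java.util.regex.Pattern and java.util.regex.Matcher
    // Pattern has no public constructor; use Pattern.compile
    Pattern p = Pattern.compile("(/{0,2})([^/]+)");
    Matcher m = p.matcher(str);
    while (m.find()) {
       String token     = m.group(2);
       String delimiter = m.group(1); // preceding delimiter; empty string if there is none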
       /*
        * ...
        */
    }
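
If the goal is the exact List from the question, with tokens and delimiters interleaved in order, one possible sketch is to let a single pattern match either a delimiter or a run of non-slash characters and collect every match. The class name SlashTokenizer and the method name tokenize are made up for this illustration; only java.util.regex from the standard library is used, and this is not necessarily how the asker's final code looked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
public class SlashTokenizer {

    // Alternatives are tried left to right, so "//" is matched before "/".
    // The last alternative matches any run of characters that are not slashes.
    private static final Pattern TOKEN = Pattern.compile("//|/|[^/]+");

    // Returns the tokens and the delimiters of the input, in their original order.
    public static List<String> tokenize(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("some/string//with/slash/or//two"));
        // prints [some, /, string, //, with, /, slash, /, or, //, two]
    }
}
```

Because every character of the input is either a slash or not, the matches cover the whole string, so leading and trailing delimiters end up in the List as well and no empty tokens are produced.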