Tags: java, string, split, apache-stringutils

String.split vs StringUtils.split in Java give different results


Consider a string like the one below, with the delimiter __|__.

String str = "a_b__|__c_d";

str.split("__\\|__") gives 2 splits, a_b and c_d. StringUtils.split(str, "__|__") or StringUtils.split(str, "__\\|__") gives 4 splits, a, b, c, and d, which is not what I want.

Is there any way to make StringUtils.split() give the same results as String.split()?


Solution

  • String.split() has some very surprising semantics, and it's rarely what you want. You should prefer StringUtils (or Guava's Splitter).

    Your specific issue is that String.split() takes a regular expression, while StringUtils.split() treats each character of the separator string as a separate delimiter, so "__|__" splits on every _ and every |. Use StringUtils.splitByWholeSeparator() to split on the whole separator string instead.

    StringUtils.splitByWholeSeparator(str, "__|__");
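
To see the two behaviours side by side without Commons Lang on the classpath, here is a minimal sketch using only the JDK. It is an illustration, not the StringUtils implementation: java.util.StringTokenizer stands in for the "each character is a delimiter" behaviour, and String.split() with Pattern.quote() stands in for whole-separator splitting. The class and method names (SplitSemantics, splitOnAnyChar, splitOnWholeSeparator) are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class SplitSemantics {

    // Character-set semantics: every character in `delimiters` is its own
    // delimiter, and runs of adjacent delimiters yield no empty tokens.
    static List<String> splitOnAnyChar(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    // Whole-separator semantics: Pattern.quote() makes the regex engine match
    // the separator literally, so '|' is not treated as alternation.
    static List<String> splitOnWholeSeparator(String input, String separator) {
        return Arrays.asList(input.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        String str = "a_b__|__c_d";
        System.out.println(splitOnAnyChar(str, "__|__"));        // [a, b, c, d]
        System.out.println(splitOnWholeSeparator(str, "__|__")); // [a_b, c_d]
    }
}
```

Pattern.quote() also saves you from escaping the separator by hand if you decide to stay with String.split(). Edge cases such as empty tokens may not match splitByWholeSeparator() exactly, so check them against your own input.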