string java java-7 java-6

String comparison - JDK versions produce different results


I've implemented a basic Guava predicate:

public static Predicate<String> containsPatternIgnoreCase(@Nullable final String input) {
   checkNotNull(input);
   return new Predicate<String>() {
       @Override
       public boolean apply(@Nullable String current) {
           checkNotNull(current);
           // Lowercase both sides with a fixed locale (static import of Locale.ENGLISH)
           return current.toLowerCase(ENGLISH).contains(input.toLowerCase(ENGLISH));
       }
   };
}

Everything works, except that one test case fails on Travis:

assertThat(containsPatternIgnoreCase("TURKİYE").apply("turkiye güzel")).isTrue();

I took care not to rely on the default locale in my implementation, so I really wonder what could be wrong there. Could it depend on the JDK version?

Here is what's used on my machine:

java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06-451-10M4406)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01-451, mixed mode)

And on Travis CI:

java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Thanks a lot for your help! Rolf


Solution

  • Case folding is a complicated problem, and it is apparently not possible to get it "right" unless you use the right locale.

    This W3C wiki page discusses the issue: http://www.w3.org/International/wiki/Case_folding

    Yes, you have probably found a JDK dependency here. The fix, though, is probably to stop expecting case folding to be consistent when the locale doesn't match the language of the text (or text fragment) you are processing.
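
To make the locale dependence concrete, here is a minimal sketch, not the asker's actual code. The class `CaseFoldingDemo` and the helper `containsIgnoreCase` are hypothetical names introduced for illustration. It assumes Java 7 or later, where lowercasing U+0130 (İ) under a non-Turkish locale yields "i" followed by U+0307 (combining dot above), so the lowercased pattern no longer matches a plain "i". The strings use Unicode escapes so the result does not depend on the source file encoding.

```java
import java.util.Locale;

final class CaseFoldingDemo {

    // Hypothetical helper: case-insensitive containment under an explicit locale.
    static boolean containsIgnoreCase(String text, String pattern, Locale locale) {
        return text.toLowerCase(locale).contains(pattern.toLowerCase(locale));
    }

    public static void main(String[] args) {
        String pattern = "TURK\u0130YE";      // U+0130: LATIN CAPITAL LETTER I WITH DOT ABOVE
        String text = "turkiye g\u00fczel";   // U+00FC: LATIN SMALL LETTER U WITH DIAERESIS
        Locale turkish = Locale.forLanguageTag("tr-TR"); // forLanguageTag needs Java 7+

        // Under ENGLISH, U+0130 becomes two chars ("i" + U+0307); under Turkish, one ("i").
        System.out.println(pattern.toLowerCase(Locale.ENGLISH).length()); // 8 on Java 7+
        System.out.println(pattern.toLowerCase(turkish).length());        // 7

        System.out.println(containsIgnoreCase(text, pattern, Locale.ENGLISH)); // false on Java 7+
        System.out.println(containsIgnoreCase(text, pattern, turkish));        // true
    }
}
```

Passing the locale in as a parameter keeps the choice with the caller, who is more likely to know the language of the text. Note that the Turkish locale is not a universal fix: under it, "I" lowercases to the dotless "ı" (U+0131), which changes the results for English text.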