android regex locale

PatternSyntaxException in non-latin locales


I have a regex that worked fine until I switched my locale to 'fa' (Persian). I suspect the same would happen with Hebrew and Arabic, though I'm not yet sure whether it's the characters or the RTL direction that breaks it.

The line of code causing the exception is:

public static final Pattern NAME_REGEX = Pattern.compile(String.format("^[\\w ]{%d,%d}$", 2,24));

The syntax is fine (it works in English and Spanish), but when the app tries to compile the regex in the 'incompatible' locales, I get the following:

at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:605)
at dalvik.system.NativeStart.main(Native Method)
Caused by: java.util.regex.PatternSyntaxException: Syntax error U_REGEX_BAD_INTERVAL     near index 8:
^[\w ]{٢,٢٤}$
   ^
at java.util.regex.Pattern.compileImpl(Native Method)
at java.util.regex.Pattern.compile(Pattern.java:400)
at java.util.regex.Pattern.<init>(Pattern.java:383)
at java.util.regex.Pattern.compile(Pattern.java:374)
at com.airg.hookt.config.airGConstant.<clinit>(airGConstant.java:131)

Any help would be appreciated. Thanks


Solution

  • ANSWER

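    So the problem was indeed String.format. Without an explicit Locale argument it formats %d using the default locale, so under 'fa' the interval bounds came out as ٢ and ٢٤ (visible in the pattern printed in the stack trace), which the regex engine rejects as a bad interval.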

    Changing

    public static final Pattern NAME_REGEX = Pattern.compile(String.format("^[\\w ]{%d,%d}$", 2,24));
    

    to

    public static final Pattern NAME_REGEX = Pattern.compile("^[\\w ]{" + 2 + "," + 24 + "}$");
    

    fixed the crash. Thanks to everyone for their contribution.
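
If you would rather keep the format string, another option is to pass an explicit locale so that %d always yields ASCII digits. The following is a minimal sketch, not the code from the original app: the class name NamePattern and the helper build are hypothetical, and it assumes Locale.ROOT is acceptable for this pattern since the output is meant for the regex engine rather than for display.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of the original app.
final class NamePattern {

    // Builds the name pattern with ASCII digits regardless of the default locale.
    static Pattern build(int min, int max) {
        // Locale.ROOT makes %d locale-neutral, so the interval is always {2,24}-style ASCII.
        return Pattern.compile(String.format(Locale.ROOT, "^[\\w ]{%d,%d}$", min, max));
    }

    public static void main(String[] args) {
        // Simulate a device set to Persian; the explicit Locale.ROOT above ignores this.
        Locale.setDefault(Locale.forLanguageTag("fa"));

        Pattern nameRegex = build(2, 24);
        System.out.println(nameRegex.pattern());
        System.out.println(nameRegex.matcher("John Doe").matches());
    }
}
```

The general rule is that the one-argument String.format is for text shown to the user. Strings that a machine will parse (regexes, SQL, file formats) should either use an explicit locale or avoid format altogether, as the concatenation above does.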