Over the last two hours I have had a lot of sexy time with Thai script strings that slipped into my database. They collate mysteriously, mutate when output, have no natural order, and are a disaster.
I want to just ignore any strings containing Thai script characters, but I have no idea how:
Pattern.compile("\\p{Thai}") fails on init. "[ก-๛]" - would that ever work? What's the correct way?
Thai is a Unicode block, and Unicode blocks should be specified as \p{In...}:
Pattern.compile("\\p{InThai}")
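To actually skip the offending strings, the pattern above can be applied with find(), which reports whether any Thai-block character occurs anywhere in the string. The following is only a minimal sketch: the class name ThaiFilter and the helpers containsThai and withoutThai are made up for illustration, and the sample strings are written as Unicode escapes so the example does not depend on the source file encoding.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any library.
class ThaiFilter {
    // Matches a single character from the Thai Unicode block (U+0E00..U+0E7F).
    private static final Pattern THAI = Pattern.compile("\\p{InThai}");

    // True if the string contains at least one Thai-block character.
    static boolean containsThai(String s) {
        return THAI.matcher(s).find();
    }

    // Returns only the strings that contain no Thai-block characters.
    static List<String> withoutThai(List<String> input) {
        return input.stream()
                    .filter(s -> !containsThai(s))
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "hello",
            "\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35", // Thai greeting
            "mixed \u0E44\u0E17\u0E22 text",        // Latin text with a Thai word
            "1234");
        System.out.println(withoutThai(rows)); // prints [hello, 1234]
    }
}
```

Note that \p{InThai} selects by block, so it also covers code points in U+0E00..U+0E7F that have no character assigned. On Java 7 and later, \p{IsThai} should select by script instead, if that distinction matters for your data.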