What can I use instead of \b to match whole words in Cyrillic text?
I have the text "текст" in a SQLite database column.
This works:
select * from myTable where text REGEXP 'текст'
This does not work:
select * from myTable where text REGEXP '\bтекст\b'
It turns out your SQLite REGEXP implementation is based on PCRE. You can make \b Unicode-aware by using the (*UCP) PCRE verb:
'(*UCP)\bтекст\b'
There are some details about the verb in the pcrepattern man page:
Another special sequence that may appear at the start of a pattern is (*UCP). This has the same effect as setting the PCRE_UCP option: it causes sequences such as \d and \w to use Unicode properties to determine character types, instead of recognizing only characters with codes less than 128 via a lookup table.
And later:
Note also that PCRE_UCP affects \b, and \B because they are defined in terms of \w and \W. Matching these sequences is noticeably slower when PCRE_UCP is set.
That makes sense: matching is slower because the engine now has to consult the Unicode property tables instead of a small lookup table.
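
To see why \b fails without Unicode-aware word characters, here is a minimal sketch in Python. It is an analogy, not your setup: it registers a hypothetical REGEXP function backed by Python's re module (not PCRE), and uses the re.ASCII flag to imitate an engine where \w only covers characters below 128. The helper names and sample rows are made up for illustration.

```python
import re
import sqlite3


def make_regexp(flags):
    """Build a REGEXP callback using Python's re module (an assumption; not PCRE)."""
    def regexp(pattern, value):
        # SQLite evaluates "value REGEXP pattern" as regexp(pattern, value).
        if value is None:
            return None
        return int(re.search(pattern, value, flags) is not None)
    return regexp


def matching_rows(flags):
    """Run the whole-word query against an in-memory table with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("regexp", 2, make_regexp(flags))
    conn.execute("CREATE TABLE myTable (text TEXT)")
    conn.executemany(
        "INSERT INTO myTable VALUES (?)",
        [("это текст",), ("контекстный",)],
    )
    rows = conn.execute(
        "SELECT text FROM myTable WHERE text REGEXP ?",
        (r"\bтекст\b",),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# re.ASCII: Cyrillic letters are not \w, so \b never finds a boundary around them.
ascii_rows = matching_rows(re.ASCII)

# Default (Unicode) mode: Cyrillic letters count as \w, so \b behaves as expected.
unicode_rows = matching_rows(0)

print(ascii_rows)
print(unicode_rows)
```

In ASCII mode the query finds nothing, because neither side of "т" counts as a word character. In Unicode mode it finds the row where "текст" stands alone but not the one where it is embedded in a longer word, which is the behavior (*UCP) is meant to give you in PCRE.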