regex vim unicode

Vim regex matches Unicode characters as non-word characters


I have the following text:

üyü

The following regex search matches each ü, even though \W should only match non-word characters:

/\W

Is there a Unicode flag for Vim regexes?


Solution

  • Unfortunately, there is no such flag (yet).

    Some built-in character classes can include multi-byte characters; others don't. The common \w, \a, \l, and \u classes contain only ASCII characters (ASCII letters, plus digits and the underscore in the case of \w), so even letters with umlauts such as ü are excluded, which leads to unexpected behavior. See also https://unix.stackexchange.com/a/60600/18876.

    In the 'isprint' option (and in 'iskeyword', which determines what motions like w move over), multi-byte characters 256 and above are always included; only the extended ASCII characters up to 255 are specified with these options.
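
One possible workaround is the \k atom, which follows 'iskeyword' instead of the fixed ASCII set behind \w. The lines below are a minimal sketch, not a definitive fix. They assume 'encoding' is utf-8 and that 'iskeyword' still covers the range 192-255 (the stock Unix default does, but a filetype plugin may have changed it):

```vim
" Check the current keyword definition.
" The default outside Windows is @,48-57,_,192-255, which covers ü (252).
:set iskeyword?

" Match a run of keyword characters; under the assumption above
" this should cover the whole word üyü.
/\k\+

" Rough replacement for \W: any single character that is not a
" keyword character (\k\@! is a zero-width negative match).
/\k\@!.
```

Because \k depends on a buffer-local option, the same pattern can match differently from one filetype to the next, so it is worth checking 'iskeyword' in the buffer where the search runs.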