javascript regex unicode ucp

Problem with \b in cyrillic (JavaScript Regex)


I have this regex: (*UCP).*\bпроверка\b.*. It works well on regex101.com (https://regex101.com/r/9elF5c), but not in JavaScript.

const regex = /(*UCP).*\bпроверка\b.*/
console.log(regex.test('а проверка б'))

Can someone please explain what the problem is and how to fix it?


Solution

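  • (*UCP) is a modifier supported by PCRE, not by JavaScript.

    JavaScript throws a syntax error on this pattern because it does not recognize (*UCP): the ( opens a group and the * is a quantifier with nothing to repeat.

    Removing (*UCP) alone is not enough, because \b in JavaScript relies on \w, which matches only ASCII [A-Za-z0-9_], even with the u flag. Cyrillic letters count as non-word characters, so there is no word boundary between a space and п.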

    If the string should have whitespace boundaries on the left and right:

    .*(?<!\S)проверка(?!\S).*
    

    const regex = /.*(?<!\S)проверка(?!\S).*/
    console.log(regex.test('а проверка б'))
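
If the word may also sit next to punctuation rather than whitespace, one possible alternative is to emulate a Unicode-aware \b with lookarounds over Unicode property escapes, which need the u flag. The sketch below is only an illustration: unicodeWordRegex is a hypothetical helper name, not something from the question, and treating letters, digits, and the underscore as "word" characters is an assumption modeled on \w.

```javascript
// Hypothetical helper (not from the original answer): builds a regex that
// matches `word` only when it is not directly adjacent to a Unicode letter,
// a Unicode digit, or an underscore.
function unicodeWordRegex(word) {
  // Escape regex metacharacters so the word is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // \p{L} = any letter, \p{N} = any number; both require the `u` flag.
  return new RegExp(`(?<![\\p{L}\\p{N}_])${escaped}(?![\\p{L}\\p{N}_])`, 'u');
}

const unicodeRegex = unicodeWordRegex('проверка');
console.log(unicodeRegex.test('а проверка б'));  // true
console.log(unicodeRegex.test('(проверка)'));    // true
console.log(unicodeRegex.test('перепроверка'));  // false
```

Unlike the whitespace version, this also accepts the word when it is wrapped in brackets or followed by a comma. Lookbehind and \p{...} both require an ES2018-capable engine.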