
What is a non-word boundary in regex (\B), compared to a word boundary?




Solution

  • A word boundary (\b) is a zero-width match that can match:

    • Between a word character (\w) and a non-word character (\W) or
    • Between a word character and the start or end of the string.

    In JavaScript, \w is defined as [A-Za-z0-9_] and \W is any other character.
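
    As a quick check of that definition, here is a minimal sketch. The helper name isWordChar is made up for illustration; it tests single characters against \w and shows how a non-ASCII letter is treated as a non-word character:

```javascript
// isWordChar is a hypothetical helper: true when ch is a single \w character.
const isWordChar = (ch) => /^\w$/.test(ch);

console.log(isWordChar("a"));      // true
console.log(isWordChar("7"));      // true
console.log(isWordChar("_"));      // true
console.log(isWordChar("-"));      // false
console.log(isWordChar("\u00e9")); // false: "é" is outside [A-Za-z0-9_]

// Because "é" counts as \W, \b also matches between "f" and "é",
// and it does not match at the end of the string.
const marked = "caf\u00e9".replace(/\b/g, "|");
console.log(marked); // "|caf|é"
```

    One consequence is that \b can land in the middle of what a human reader would call a single word when that word contains accented letters.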

    The negated version of \b, written \B, is a zero-width match at any position where neither of the above conditions holds. It can therefore match:

    • Between two word characters.
    • Between two non-word characters.
    • Between a non-word character and the start or end of the string.
    • In the empty string, since it contains no word characters.
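
    The cases listed above can be checked one position at a time. This is only a sketch: the helper isNonWordBoundaryAt is a name invented here, and it relies on the sticky flag (y) so that the regex is tried at exactly the given index:

```javascript
// Hypothetical helper: does \B match at exactly this index of text?
function isNonWordBoundaryAt(text, index) {
  const re = /\B/y;     // sticky: only attempt a match at lastIndex
  re.lastIndex = index;
  return re.test(text);
}

const betweenWordChars = isNonWordBoundaryAt("ab", 1);    // between "a" and "b"
const betweenNonWordChars = isNonWordBoundaryAt(", ", 1); // between "," and " "
const nonWordAtStart = isNonWordBoundaryAt("!a", 0);      // start, then "!"
const nonWordAtEnd = isNonWordBoundaryAt("a!", 2);        // "!", then end
const emptyString = isNonWordBoundaryAt("", 0);           // nothing at all
const wordToNonWord = isNonWordBoundaryAt("a!", 1);       // this one is a \b

console.log(betweenWordChars, betweenNonWordChars, nonWordAtStart,
            nonWordAtEnd, emptyString); // true true true true true
console.log(wordToNonWord);             // false
```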

    For example, if the string is "Hello, world!" then \b matches in the following places:

     H e l l o ,   w o r l d !
    ^         ^   ^         ^ 
    

    And \B matches those places where \b doesn't match:

     H e l l o ,   w o r l d !
      ^ ^ ^ ^   ^   ^ ^ ^ ^   ^
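
    The two diagrams can be reproduced in code. The sketch below uses a helper named matchPositions (an illustrative name, not a built-in) that collects the index of every zero-width match using String.prototype.matchAll:

```javascript
// Hypothetical helper: indices of every match of the pattern in text.
function matchPositions(pattern, text) {
  const re = new RegExp(pattern, "g"); // matchAll requires the global flag
  return Array.from(text.matchAll(re), (m) => m.index);
}

const text = "Hello, world!";
const wordBoundaries = matchPositions("\\b", text);
const nonWordBoundaries = matchPositions("\\B", text);

console.log(wordBoundaries);    // [0, 5, 7, 12]
console.log(nonWordBoundaries); // [1, 2, 3, 4, 6, 8, 9, 10, 11, 13]
```

    The string has 13 characters and therefore 14 positions (0 through 13). Every position appears in exactly one of the two lists, which is what it means for \B to be the negation of \b.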