Tags: regex, regex-lookarounds, word-boundary

Regex negative lookahead and word boundary removes first character from capture group


I am trying to capture every word in a string except for 'and'. I also want to capture words that are surrounded by asterisks, like *this*. The regex I am using mostly works, but when it captures a word wrapped in asterisks it drops the leading asterisk (so *this* is captured as this*). Here is the regex I'm using:

/((?!and\b)\b[\w*]+)/gi

When I remove the last word boundary, it captures all of *this* but no longer leaves out any of the 'and's.
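
A minimal reproduction sketch of both behaviours, assuming JavaScript (the /gi literal suggests it). The sample sentence and variable names are made up for illustration:

```javascript
// Hypothetical sample sentence; not part of the original question.
const sample = "cats and *dogs* and birds";

// Pattern from the question: \b cannot match between a space and '*'
// (both are non-word characters), so the match only starts at the 'd'.
const withBoundary = sample.match(/((?!and\b)\b[\w*]+)/gi);
console.log(withBoundary); // [ 'cats', 'dogs*', 'birds' ]

// Same pattern without the \b before the character class: the asterisks
// survive, but a match can now start inside "and", which leaves "nd".
const withoutBoundary = sample.match(/((?!and\b)[\w*]+)/gi);
console.log(withoutBoundary); // [ 'cats', 'nd', '*dogs*', 'nd', 'birds' ]
```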


Solution

  • The problem is that * is not treated as a word character, so \b doesn't match at the position before a leading * (there is no word character on either side of that position). I think you can replace the pattern with:

    ^(?!and\b)([\w*]+)|((?!and\b)(?<=\W)[\w*]+)
    

    The \b was replaced with the lookbehind (?<=\W) (preceded by a non-word character) so that a leading * is matched too. On its own, however, this would miss the first word in the string, because it is not preceded by a non-word character. This is why I added the ^ alternative.

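
A short sketch of how the pattern above behaves, assuming a JavaScript engine with lookbehind support (ES2018 or later). The function names and sample strings are hypothetical. The second function is a possible single-lookbehind variant, not something from the answer itself:

```javascript
// Hypothetical helper wrapping the pattern from the answer.
function wordsExceptAnd(text) {
  const pattern = /^(?!and\b)([\w*]+)|((?!and\b)(?<=\W)[\w*]+)/gi;
  // With the g flag, match() returns the full matches (or null if none).
  return text.match(pattern) || [];
}

// Assumed variant: a negative lookbehind also succeeds at the start of the
// string, so the separate ^ alternative is not needed.
function wordsExceptAndSingle(text) {
  return text.match(/(?<![\w*])(?!and\b)[\w*]+/gi) || [];
}

console.log(wordsExceptAnd("cats and *dogs* and birds"));
// [ 'cats', '*dogs*', 'birds' ]
console.log(wordsExceptAnd("And *this* and that"));
// [ '*this*', 'that' ]
console.log(wordsExceptAndSingle("android and band"));
// [ 'android', 'band' ]
```

Both versions skip only the standalone word 'and' (in any case, because of the i flag). Words that merely contain it, such as 'android' or 'band', are still captured.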