
Why do negated character classes in JavaScript regular expressions traverse newlines even with multiline mode disabled?


I encountered some strange behaviour: negated character classes match across newlines even when the m (multiline) flag is not set.

> node
Welcome to Node.js v22.7.0.
Type ".help" for more information.
> 'abc\nabc\nabc\n'.replace(/b[^z]+/g, '')
'a'
> 'abc\nabc\nabc\n'.replace(/b[^z\n]+/g, '')
'a\na\na\n'

I expected to get the first result only when the m (multiline) flag is enabled:

> 'abc\nabc\nabc\n'.replace(/b[^z]+/gm, '')
'a'

Is this a bug, or is this expected? If it is expected, what is the reasoning?

I was able to work around it by appending ?$ to the pattern, which gives the per-line result once the m flag is set:

> 'abc\nabc\nabc\n'.replace(/b[^z]+?$/g, '')
'a'
> 'abc\nabc\nabc\n'.replace(/b[^z]+?$/gm, '')
'a\na\na\n'

Solution

  • From the documentation you linked to:

    ... if m is used, ^ and $ change from matching at only the start or end of the entire string to the start or end of any line within the string.

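    You aren't using either of these in your regular expression, so I wouldn't expect the behaviour to change when you remove the multiline flag. A negated class such as [^z] matches every character except z, newlines included, whether or not m is set.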
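
To make the point above concrete, here is a minimal sketch that reuses the input string from the question. The variable names are illustrative only and are not from the original post. It compares the negated class with and without m, and then shows the one thing m does change, namely where $ may match.

```javascript
const input = 'abc\nabc\nabc\n';

// [^z] matches any character except "z", newline included, so the greedy
// match runs from the first "b" to the end of the string, with or without m.
const withoutM = input.replace(/b[^z]+/g, '');
const withM = input.replace(/b[^z]+/gm, '');

// Excluding \n inside the class keeps each match on a single line.
const perLine = input.replace(/b[^z\n]+/g, '');

// The m flag only changes ^ and $: with m, $ matches at the end of every line;
// without it, $ matches only at the very end of the string.
const lineEndsWithM = (input.match(/c$/gm) || []).length;
const lineEndsWithoutM = (input.match(/c$/g) || []).length;

console.log({ withoutM, withM, perLine, lineEndsWithM, lineEndsWithoutM });
```

If the goal is to stop at the end of each line, excluding \n in the class itself (as in the second example from the question) states that intent directly and does not depend on any flag.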