javascript regex sizzle

How do the regular expressions in sizzle.js work?


If escaped characters in a regular expression created in JavaScript with the RegExp object need to be escaped again, how does the following code in sizzle.js work?

identifier = "(?:\\\\.|[\\w-]|[^\0-\\xa0])+"

If \\\\ in the string literal becomes \\ in the regular expression and \\w becomes \w, then how does \0 work when only a single backslash is used?

When run in the Google Chrome console, identifier is "(?:\\\\.|[\w-]|[^-\\xa0])+"

Is this a mistake, or am I not understanding correctly? If this is correct and this is how it is intended to work, what is the purpose of \0?


Solution

  • If your regular expression needs to contain a backslash, for example because you need \( (which matches a literal opening parenthesis) or \w (which matches a letter, digit, or underscore), and you're creating the regular expression from a string literal, then you need to write \\ in the string literal, which ends up as a single \ in the regular expression.

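    But in your \0 example, the regular expression doesn't need to contain a backslash. It just needs to contain the character U+0000 (which matches itself). So the string literal can just contain \0, which the string parser turns into the character U+0000 before the regular expression is compiled. Inside the character class, [^\0-\xa0] then matches any character outside the range U+0000 through U+00A0. The console appears to show nothing between ^ and - because U+0000 has no visible glyph.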
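
To make the two layers of escaping concrete, here is a minimal sketch that could be run in Node or a browser console. The variable names identifierEscaped, a, b, and samples are introduced here for illustration and are not from sizzle.js. The \\x00 variant is only an assumed equivalent spelling used for comparison.

```javascript
// The string literal from the question.
const identifier = "(?:\\\\.|[\\w-]|[^\0-\\xa0])+";

// After string-literal processing, "\0" is a single U+0000 character,
// not a backslash followed by a zero.
console.log(identifier.includes("\u0000")); // true
console.log(identifier.includes("\\0"));    // false

// JSON.stringify makes the invisible character visible as an escape.
console.log(JSON.stringify(identifier));

// Hypothetical alternative: let the regex engine interpret the escape instead,
// by doubling the backslash so the pattern text contains \x00.
const identifierEscaped = "(?:\\\\.|[\\w-]|[^\\x00-\\xa0])+";

const a = new RegExp("^" + identifier + "$");
const b = new RegExp("^" + identifierEscaped + "$");

// Both patterns should agree on every sample.
const samples = ["foo-bar", "a\\.b", "caf\u00e9", "a b", ""];
for (const s of samples) {
  console.log(JSON.stringify(s), a.test(s), b.test(s));
}
```

Either spelling gives the regular expression the same range; the single-backslash form just resolves the escape one step earlier, in the string literal rather than in the pattern.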