javascript html regex comments

JavaScript Regex to select multi-line html comment


Suppose I have an HTML comment like this:

<!--hello world-->

We can get the comment with this regex:

var expression = /<!--(.*)-->/g

Working example:

var expression = /<!--(.*)-->/g;

console.log(document.querySelector("div").innerHTML.match(expression)[0]);
<div>
  <!--hello world-->
</div>

But if the comment is multi-line, like this:

<!--hello
world-->

Then the regex doesn't work.

var expression = /<!--(.*)-->/g;

console.log(document.querySelector("div").innerHTML.match(expression)[0]); // Will throw an error because it could not find any matches
<div>
  <!--hello
  world-->
</div>

How can I select the multi-line HTML comment?


Solution

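  • Use the pattern [\s\S]* instead of .* to match any character, including newlines (by default, . does not match line terminators). Alternatively, use the s (dotAll) flag if you want .* to match newlines as well. Note that the m flag does not do this; it only changes how ^ and $ behave.

    Use the non-greedy pattern [\s\S]*? if you expect more than one <!--...--> comment; a greedy pattern would match everything from the first <!-- to the last -->.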

    Working test case:

    var expression = /<!--[\s\S]*?-->/g;
    
    console.log(document.querySelector("div").innerHTML.match(expression));
    <div>
      <!--hello world 1-->
      <!--hello
      world 2-->
    </div>

    Output:

    [
      "<!--hello world 1-->",
      "<!--hello\n  world 2-->"
    ]
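
To compare the flags side by side, here is a minimal sketch that runs outside the browser. The html string is a hypothetical stand-in for document.querySelector("div").innerHTML, and extractComments is an illustrative helper, not part of the original answer. It assumes an engine that supports the s flag and String.prototype.matchAll.

```javascript
// Hypothetical input standing in for document.querySelector("div").innerHTML
const html = "<div>\n  <!--hello world 1-->\n  <!--hello\n  world 2-->\n</div>";

// s (dotAll) flag: "." also matches line terminators; "*?" keeps each match minimal
const withDotAll = /<!--(.*?)-->/gs;

// m flag: only changes ^ and $, so "." still stops at a newline
const withMultiline = /<!--(.*?)-->/gm;

// Illustrative helper: returns the text inside each comment (capture group 1).
// matchAll needs the g flag and yields nothing when there is no match,
// so there is no null result to index into.
function extractComments(source, pattern) {
  return Array.from(source.matchAll(pattern), (m) => m[1]);
}

const dotAllResult = extractComments(html, withDotAll);
const multilineResult = extractComments(html, withMultiline);

console.log(dotAllResult);    // both comments, including the multi-line one
console.log(multilineResult); // only the single-line comment
```

Using matchAll with a capture group also sidesteps the error from the question, because an empty result is an empty array rather than null.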