.net regex comments strip

Strip Comments from XML


I've encountered the need to remove comments of the form:

<!--  Foo

      Bar  -->

I'd like to use a regular expression that matches anything (including line breaks) between the beginning and end 'delimiters.'

What would a good regex be for this task?


Solution

  • The simple way:

    Regex xmlCommentsRegex = new Regex("<!--.*?-->", RegexOptions.Singleline | RegexOptions.Compiled);
    

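    And a better way, which replaces the lazy .*? with an explicit loop over any character other than '-', or a '-' that is not followed by '->' (RegexOptions.Singleline has no effect on this pattern, since it contains no '.', but it is harmless):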

    Regex xmlCommentsRegex = new Regex("<!--(?:[^-]|-(?!->))*-->", RegexOptions.Singleline | RegexOptions.Compiled);
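
To actually strip the comments, either pattern can be passed through Regex.Replace with an empty replacement string. Below is a minimal sketch of that, applied to the sample comment from the question. The class and method names (XmlCommentStripper, StripSimple, StripLookahead) and the surrounding <root>/<item> markup are made up for illustration; only the two patterns come from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative, not from the original answer.
public static class XmlCommentStripper
{
    // "Simple" pattern: lazy match; Singleline lets '.' match line breaks.
    private static readonly Regex SimplePattern =
        new Regex("<!--.*?-->", RegexOptions.Singleline | RegexOptions.Compiled);

    // "Better" pattern: any non-dash character, or a dash not followed by "->".
    private static readonly Regex LookaheadPattern =
        new Regex("<!--(?:[^-]|-(?!->))*-->", RegexOptions.Singleline | RegexOptions.Compiled);

    // Replace every match with the empty string, i.e. delete the comments.
    public static string StripSimple(string xml)
    {
        return SimplePattern.Replace(xml, string.Empty);
    }

    public static string StripLookahead(string xml)
    {
        return LookaheadPattern.Replace(xml, string.Empty);
    }

    public static void Main()
    {
        // The multi-line comment from the question, embedded in made-up markup.
        string xml = "<root><!--  Foo\n\n      Bar  --><item>1</item></root>";

        Console.WriteLine(StripSimple(xml));    // prints <root><item>1</item></root>
        Console.WriteLine(StripLookahead(xml)); // prints <root><item>1</item></root>
    }
}
```

One caveat worth keeping in mind: both patterns work on raw text, so they would also remove anything that merely looks like a comment, for example inside a CDATA section.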