How to capture the text between XML <summary> tags?


I have single-line and multi-line XML summary comments that look like these:

/// <summary> This is a single-line XML comment. </summary> 

/// <summary> This is a multi-line XML comment.
/// These are additional lines with more text.
/// Some more of these text. </summary>

/// <summary> This is another XML text summary with a different
/// format.
/// </summary>

In RegexBuddy, how would I capture just the text inside, without the /// prefixes and the <summary> and </summary> tags?

I came up with the following to capture a multi-line XML summary:

  ((\s*(///)\s*((<summary>)?))(.*))+(</summary>)$

and this one to match a single-line XML summary:

  \s*///\s*(<summary>).*(</summary>)$

But I've no idea how to capture just the text.

What regular expression would capture just the text, so that I can refer to it in a replacement?

Thank you in advance.


Solution

  • Use the PCRE engine, with multiline mode enabled so that ^ matches at the start of every line:

    (?:^///\s*(?:<summary>)?|</summary>)(*SKIP)(*F)|(?:(?!</?summary>|^///(?!/)\s*).)+
    

    Explanation

    --------------------------------------------------------------------------------
      (?:                      group, but do not capture:
    --------------------------------------------------------------------------------
        ^                        the beginning of a line (in multiline mode)
    --------------------------------------------------------------------------------
        ///                      '///'
    --------------------------------------------------------------------------------
        \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                                 or more times (matching the most amount
                                 possible))
    --------------------------------------------------------------------------------
        (?:                      group, but do not capture (optional
                                 (matching the most amount possible)):
    --------------------------------------------------------------------------------
          <summary>                '<summary>'
    --------------------------------------------------------------------------------
        )?                       end of grouping
    --------------------------------------------------------------------------------
       |                        OR
    --------------------------------------------------------------------------------
        </summary>               '</summary>'
    --------------------------------------------------------------------------------
      )                        end of grouping
    --------------------------------------------------------------------------------
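      (*SKIP)                     'SKIP' verb: if the match fails later, do
                                  not retry from inside the text matched so
                                  far; resume the search after it
    --------------------------------------------------------------------------------
      (*F)                        'FAIL' verb: forces a failure here, so the
                                  text matched by the group is discarded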
    --------------------------------------------------------------------------------
     |                        OR
    --------------------------------------------------------------------------------
      (?:                      group, but do not capture (1 or more times
                               (matching the most amount possible)):
    --------------------------------------------------------------------------------
        (?!                      look ahead to see if there is not:
    --------------------------------------------------------------------------------
          <                        '<'
    --------------------------------------------------------------------------------
          /?                       '/' (optional (matching the most
                                   amount possible))
    --------------------------------------------------------------------------------
          summary>                 'summary>'
    --------------------------------------------------------------------------------
         |                        OR
    --------------------------------------------------------------------------------
          ^                        the beginning of a line (in multiline mode)
    --------------------------------------------------------------------------------
          ///                      '///'
    --------------------------------------------------------------------------------
          (?!                      look ahead to see if there is not:
    --------------------------------------------------------------------------------
            /                        '/'
    --------------------------------------------------------------------------------
          )                        end of look-ahead
    --------------------------------------------------------------------------------
          \s*                      whitespace (\n, \r, \t, \f, and " ")
                                   (0 or more times (matching the most
                                   amount possible))
    --------------------------------------------------------------------------------
        )                        end of look-ahead
    --------------------------------------------------------------------------------
        .                        any character except \n
    --------------------------------------------------------------------------------
      )+                       end of grouping
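
For anyone who wants to try the idea outside RegexBuddy, here is a minimal sketch in Python. It assumes the standard `re` module, which does not support (*SKIP)(*F), so it swaps in a different technique: the /// prefixes and the tags are matched by the first alternatives and thrown away, and the wanted text is captured in group 1. The names PATTERN and summary_text are made up for illustration, and joining the lines with a single space is my own choice, not part of the answer above.

```python
import re

# Stand-in for the PCRE pattern. Python's built-in `re` has no (*SKIP)(*F),
# so unwanted pieces are matched (and ignored) by the first two alternatives,
# and only the third alternative fills capture group 1.
PATTERN = re.compile(
    r"^///\s*(?:<summary>)?"        # '///' at line start, optional opening tag
    r"|</summary>"                  # closing tag
    r"|((?:(?!</?summary>).)+)",    # group 1: run of text that starts no tag
    re.MULTILINE,                   # let ^ match at the start of every line
)


def summary_text(comment: str) -> str:
    """Return the summary text with '///' prefixes and tags removed."""
    # group(1) is None whenever a prefix or tag alternative matched.
    parts = [m.group(1).strip() for m in PATTERN.finditer(comment) if m.group(1)]
    # Drop pieces that were only whitespace, then join lines with one space.
    return " ".join(p for p in parts if p)


sample = (
    "/// <summary> This is another XML text summary with a different\n"
    "/// format.\n"
    "/// </summary>"
)
print(summary_text(sample))
# This is another XML text summary with a different format.
```

Like the PCRE pattern, this sketch expects /// to sit at the very start of each line; indented comments would need leading whitespace allowed before the ///.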