c# .net regex capturing-group

Regex: How to capture all iterations in repeated capturing group


I would expect these lines of C#:

var regex = new Regex("A(bC*)*");
var match = regex.Match("AbCCbbCbCCCCbbb");
var groups = match.Groups;

to return something like:

["AbCCbbCbCCCCbbb", "A", "bCC", "b", "bC", "bCCCC", "b", "b", "b"]

but instead the group contains only the last captured iteration:

["AbCCbbCbCCCCbbb", "b"]

Regex101 also displays the following warning:

A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data

How should I change my regex pattern?


Solution

  • If you also want to capture A, wrap it in parentheses: new Regex("(A)(bC*)*"). See the regex demo.

    Then collect all the values from each group's CaptureCollection (Group.Captures), which in .NET keeps every iteration of a repeated group, not just the last one:

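    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    var regex = new Regex("(A)(bC*)*");
    // Flatten every capture of every group (group 0 is the whole match) across all matches.
    var values = regex.Matches("AbCCbbCbCCCCbbb")
        .Cast<Match>()
        .SelectMany(m => m.Groups.Cast<Group>()
            .SelectMany(g => g.Captures
                .Cast<Capture>()
                .Select(c => c.Value)))
        .ToList();
    // Prints the whole match, then A, then bCC, b, bC, bCCCC, b, b, b (one per line).
    foreach (var s in values)
        Console.WriteLine(s);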
    

    See the C# demo
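
If only the iterations of the repeated group are needed, a single Match is enough: Groups[2].Value still holds just the last iteration, while Groups[2].Captures holds all of them. The following is a minimal sketch under that assumption; the class RepeatedGroupDemo and the helper AllCaptures are hypothetical names introduced for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RepeatedGroupDemo
{
    // Hypothetical helper: returns every capture recorded for the group
    // at groupIndex in the first match, or an empty list if nothing matches.
    public static List<string> AllCaptures(string pattern, string input, int groupIndex)
    {
        Match match = Regex.Match(input, pattern);
        if (!match.Success)
            return new List<string>();

        // Group.Value is only the last iteration; Group.Captures keeps them all.
        return match.Groups[groupIndex].Captures
            .Cast<Capture>()
            .Select(c => c.Value)
            .ToList();
    }

    public static void Main()
    {
        var iterations = AllCaptures("(A)(bC*)*", "AbCCbbCbCCCCbbb", 2);
        Console.WriteLine(string.Join(", ", iterations));
        // bCC, b, bC, bCCCC, b, b, b
    }
}
```

Group index 2 is the repeated (bC*) group here because (A) takes index 1; with the original pattern A(bC*)* the same group would be index 1.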