C# Regex capturing group not working


In the following code I want to capture anything that begins with test and is followed by text enclosed in double quotes, e.g.:

test"abc"

test"rst"

This code works fine:

private void testRegex()
{
    string st = "this test\"abc\"= or test\"rst\"\"uvw\" or test(def)(abc) is a test.";
    Regex oRegex = new Regex("test\".*?\"");

    foreach (Match mt in oRegex.Matches(st))
    {
        Console.WriteLine(mt.Value);
    }
}

Then, from the above captures, I want to capture the subexpression that follows the word test (in the examples above, "abc" and "rst", including the double quotes). I tried the following, and it correctly gives me:

"abc"

"rst"

private void testRegex()
{
    string st = "this test\"abc\"= or test\"rst\"\"uvw\" or test(def)(abc) is a test.";
    Regex oRegex = new Regex("test(\".*?\")");

    foreach (Match mt in oRegex.Matches(st))
    {
        Console.WriteLine(mt.Groups[1].Value);
    }
}

Question: Now I want to capture two subexpressions: (1) the quoted text, "abc" and "rst", and (2) any single character other than " that follows the matches test"abc" and test"rst". I tried the following, but as shown below, groups 1 and 2 for the match on test"rst""uvw" are wrong. I need group 1 to be "rst" and group 2 to be empty, since the character that follows "rst" is ":

Group 1: "abc"

Group 2: =

Group 1: "rst""

Group 2: u

private void testRegex()
{
    string st = "this test\"abc\"= or test\"rst\"\"uvw\" or test(def)(abc) is a test.";
    Regex oRegex = new Regex("test(\".*?\")([^\"])");

    foreach (Match mt in oRegex.Matches(st))
    {
        Console.WriteLine(mt.Groups[1].Value);
        Console.WriteLine(mt.Groups[2].Value);
    }
}

Solution

  • You are probably looking for:

    test("[^"]*")([^"])?
    

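    I made 2 changes:

    • Used the negated character class [^"]* (zero or more characters other than a double quote) instead of the lazy .*?. With .*?, when the trailing [^"] fails on the quote that follows "rst", the engine backtracks and lets .*? extend past that closing quote, which is why group 1 came out as "rst"" and group 2 as u.
    • Made the [^"] group optional with the ? quantifier, so the match still succeeds, with group 2 empty, when the next character is a double quote.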
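
For reference, here is a minimal sketch of how the pattern above could be used from C#, run against the sample string from the question. The class name QuotedAfterTest and the helper Extract are hypothetical names chosen for illustration, and the tuple syntax assumes C# 7 or later. A verbatim string (@"...") is used so the regex needs no backslash escapes; inside it, a literal double quote is written as "".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class QuotedAfterTest
{
    // Same pattern as above: test("[^"]*")([^"])?
    // In a verbatim string, "" stands for one literal double quote.
    private static readonly Regex Pattern =
        new Regex(@"test(""[^""]*"")([^""])?");

    // Returns one (Quoted, Next) pair per match.
    // Next is the empty string when optional group 2 did not participate.
    public static List<(string Quoted, string Next)> Extract(string input)
    {
        var results = new List<(string Quoted, string Next)>();
        foreach (Match m in Pattern.Matches(input))
        {
            results.Add((m.Groups[1].Value, m.Groups[2].Value));
        }
        return results;
    }

    public static void Main()
    {
        string st = "this test\"abc\"= or test\"rst\"\"uvw\" or test(def)(abc) is a test.";

        foreach (var (quoted, next) in Extract(st))
        {
            Console.WriteLine("Group 1: " + quoted);
            Console.WriteLine("Group 2: " + next);
        }
    }
}
```

For the sample string this should print "abc" and = for the first match, then "rst" and an empty group 2 for the second. If you need to tell "group 2 did not match" apart from an empty value, check m.Groups[2].Success instead of comparing Value to the empty string.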
