c# regex regex-group

Replace all occurrences of the Tab character within double quotes


Ultimately, I want to replace every \t that is enclosed within double quotes. I'm currently on Regex101 trying various iterations of my regex. This is the closest I have so far:

originString = blah\t\"blah\tblah\"\t\"blah\"\tblah\tblah\t\"blah\tblah\t\tblah\t\"\t\"\tbleh\"
regex = \t?+\"{1}[^"]?+([\t])?+[^"]?+\"
\t?+       maybe one or more tab
\"{1}      a double quote
[^"]?+     anything but a double quote
([\t])?+   capture all the tabs
[^"]?+     anything but a double quote
\"         a double quote

My logic is flawed! I need your help in grouping the tab characters.
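
A note on the grouping part of the question. In PCRE-style flavors, `?+` is a possessive "zero or one" quantifier, so `[^"]?+` matches at most one character, and .NET has no possessive quantifiers at all. A repeated group is still possible in .NET, though: every iteration of the group is kept in `Group.Captures`. The following is a minimal sketch of that idea, not the asker's pattern; the sample input and the variable names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample input: one quoted substring containing two tabs.
var input = "a\t\"b\tc\td\"";

// Regex: "(?:[^"\t]*(\t))*[^"\t]*"
// Group 1 sits inside a repeated non-capturing group, so it matches once per tab.
var tabsInQuotes = new Regex("\"(?:[^\"\\t]*(\\t))*[^\"\\t]*\"");

var match = tabsInQuotes.Match(input);

// .NET keeps every iteration of a repeated group in Group.Captures.
foreach (Capture tab in match.Groups[1].Captures)
{
    Console.WriteLine($"tab at index {tab.Index}");
}
```

This only locates the tabs. Replacing them is still simpler with a match evaluator, because `Regex.Replace` substitutes whole matches rather than individual captures.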


Solution

  • Match the double-quoted substrings with a simple "[^"]+" regex (if there are no escape sequences to account for) and replace the tabs only within each match, using a match evaluator:

    using System;
    using System.Text.RegularExpressions;

    var str = "A tab\there \"inside\ta\tdouble-quoted\tsubstring\" some\there";
    var pattern = "\"[^\"]+\""; // Matches a double-quoted substring with no escape sequences
    // The match evaluator runs once per quoted substring and replaces only the tabs inside it with -
    var result = Regex.Replace(str, pattern, m => m.Value.Replace("\t", "-"));
    Console.WriteLine(result);
    // => A tab\there "inside-a-double-quoted-substring" some\there   (\t = the tabs outside the quotes, left unchanged)
    

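
As a rough sketch, the same approach can be wrapped in a small helper and run against the string from the question. The helper name `ReplaceTabsInQuotes` and the second, escape-aware pattern are assumptions added here for illustration; the answer itself only covers the case without escape sequences.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces tabs only inside substrings matched by quotedPattern.
string ReplaceTabsInQuotes(string input, string quotedPattern, string replacement) =>
    Regex.Replace(input, quotedPattern, m => m.Value.Replace("\t", replacement));

// Pattern from the answer: a quoted substring with no escape sequences.
const string simple = "\"[^\"]+\"";

// Assumed extension: "[^"\\]*(?:\\.[^"\\]*)*" also accepts backslash escapes such as \" inside the quotes.
const string withEscapes = @"""[^""\\]*(?:\\.[^""\\]*)*""";

// The string from the question.
var origin = "blah\t\"blah\tblah\"\t\"blah\"\tblah\tblah\t\"blah\tblah\t\tblah\t\"\t\"\tbleh\"";
var fixedOrigin = ReplaceTabsInQuotes(origin, simple, "-");
Console.WriteLine(fixedOrigin);

// A made-up input whose quoted part contains escaped quotes.
var escaped = "x\t\"a\tb \\\"c\td\\\" e\"\ty";
var fixedEscaped = ReplaceTabsInQuotes(escaped, withEscapes, "-");
Console.WriteLine(fixedEscaped);
```

One design point to keep in mind: because `[^"]+` requires at least one character, an empty pair `""` is not matched by the simple pattern, which can shift the quote pairing for the rest of the input. Using `[^"]*` instead avoids that if empty quoted strings can occur.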