Find quoted strings and replace content between double quotes


I have a string, e.g.

"24.09.2019","545","878","5"

that should be processed to

"{1}","{2}","{3}","{4}"

Now I am trying to use a regular expression:

string replacementString="{NG}";
Regex regex = new Regex("\\\"[0-9\\.]+\\\"");
MatchCollection matches = regex.Matches(originalString);

List<string> replacements = new List<string>();

for (int x = 0; x < matches.Count; x++)    
{    
    string replacement = String.Copy(replacementString);    
    replacement = replacement.Replace("NG", (x + 1).ToString());    
    replacements.Add(replacement);
    Match match = matches[x];
}

replacements.Reverse();

int cnt = 0;    

foreach (var match in matches.Cast<Match>().Reverse())    
{    
    originalStringTmp = originalStringTmp.Replace(
        match.Index, 
        match.Length, 
        replacements[cnt]);    

    cnt++;    
}

And this extension method:

public static string Replace(this string s, int index, int length, string replacement)
{
    var builder = new StringBuilder();

    builder.Append(s.Substring(0, index));
    builder.Append(replacement);
    builder.Append(s.Substring(index + length));

    return builder.ToString();
}

But in this case the result is

{1},{2},{3},{4}

What regular expression should I use instead of

\"[0-9\.]+\"

to achieve the result

"{1}","{2}","{3}","{4}"

with C# regular expression?


Solution

  • Let's use Regex.Replace to replace every quoted string in the input (I've assumed that a quotation mark inside a quoted string is escaped by doubling it: "abc""def" -> abc"def):

      string source = "\"24.09.2019\",\"545\",\"878\",\"5\"";
    
      int index = 0;
    
      string result = Regex.Replace(source, "\"([^\"]|\"\")*\"", m => $"\"{{{++index}}}\"");
    

    Demo:

      Func<string, string> convert = (source => {
        int index = 0;
    
        return Regex.Replace(source, "\"([^\"]|\"\")*\"", m => $"\"{{{++index}}}\"");
      });
    
      String[] tests = new string[] {
        "abc",
        "\"abc\", \"def\"\"fg\"",
        "\"\"",
        "\"24.09.2019\",\"545\",\"878\",\"5\"",
        "name is \"my name\"; value is \"78\"\"\"\"\"",
        "empty: \"\" and not empty: \"\"\"\""
      };
    
      string demo = string.Join(Environment.NewLine, tests
        .Select(test => $"{test,-50} -> {convert(test)}"));
    
      Console.Write(demo);
    

    Outcome:

    abc                                                -> abc
    "abc", "def""fg"                                   -> "{1}", "{2}"
    ""                                                 -> "{1}"
    "24.09.2019","545","878","5"                       -> "{1}","{2}","{3}","{4}"
    name is "my name"; value is "78"""""               -> name is "{1}"; value is "{2}"
    empty: "" and not empty: """"                      -> empty: "{1}" and not empty: "{2}"
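
As for why the original attempt printed {1},{2},{3},{4}: the pattern \"[0-9\.]+\" makes the quotes part of each match, so they are overwritten by a replacement that contains none. A minimal sketch of one alternative, assuming every field is quoted as in the sample input, keeps the digits-and-dots pattern but moves the quotes into lookarounds so they are never consumed. The class and method names here are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedNumberDemo
{
    // Hypothetical helper: numbers every run of digits and dots that sits
    // between two double quotes, leaving the quotes themselves in place.
    public static string NumberQuotedValues(string source)
    {
        int index = 0;

        // (?<=") and (?=") only assert that a quote is present;
        // the quotes are not part of the match, so they survive the replacement.
        return Regex.Replace(source, "(?<=\")[0-9.]+(?=\")", m => "{" + (++index) + "}");
    }

    public static void Main()
    {
        // prints "{1}","{2}","{3}","{4}"
        Console.WriteLine(NumberQuotedValues("\"24.09.2019\",\"545\",\"878\",\"5\""));
    }
}
```

Unlike the pattern used in the demo above, this one does not pair opening and closing quotes, so it is probably only suitable when the input is known to look like the sample.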
    

    Edit: You can easily refine the replacement, e.g. if you want to replace integer numbers only:

      Func<string, string> convert = (source => {
        int index = 0;
    
        // we have a match "m" and a running counter "index";
        // our task is to provide the string that will be put in place of the match
        return Regex.Replace(
          source, 
          "\"([^\"]|\"\")*\"", 
          m => int.TryParse(m.Value.Trim('"'), out _)
            ? $"\"{{{++index}}}\"" // if the match is a valid integer, replace it
            : m.Value);            // if not, keep it intact
      });
    

    In the general case:

      Func<string, string> convert = (source => {
        int index = 0;
    
        // we have a match "m" and a running counter "index";
        // our task is to provide the string that will be put in place of the match
        return Regex.Replace(
          source, 
          "\"([^\"]|\"\")*\"", 
          m => {
            // now we have a match "m", with its value "m.Value"
            // and the counter "index",
            // and we have to return the string that will be put in place of the match
    
            // if you want the unquoted value, i.e. abc"def instead of "abc""def"
            // string unquoted = Regex.Replace(
            //   m.Value, "\"+", match => new string('"', match.Value.Length / 2)); 
    
            return //TODO: put the relevant code here
          });
      });
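
One way the template above might be filled in, as a sketch only: replace each quoted field with a numbered placeholder and collect the unquoted originals in a list. The names QuotedFieldTemplate, Templatize and values are hypothetical, and the unquoting assumes the same doubled-quote escaping as the rest of this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedFieldTemplate
{
    // Hypothetical helper: swaps every quoted field for "{n}" and
    // appends the field's unquoted text to "values".
    public static string Templatize(string source, List<string> values)
    {
        return Regex.Replace(source, "\"([^\"]|\"\")*\"", m =>
        {
            // Drop the outer quotes, then turn each doubled quote into a single one.
            string inner = m.Value.Substring(1, m.Value.Length - 2).Replace("\"\"", "\"");

            values.Add(inner);

            // The placeholder number is the 1-based position of the field.
            return "\"{" + values.Count + "}\"";
        });
    }

    public static void Main()
    {
        var values = new List<string>();
        string template = Templatize("\"abc\", \"def\"\"fg\"", values);

        Console.WriteLine(template);                   // "{1}", "{2}"
        Console.WriteLine(string.Join(" | ", values)); // abc | def"fg
    }
}
```

Keeping the captured values next to the template makes it possible to restore the original text later, if that is needed.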