c# regex double-quotes

Regex Remove pair of double quotes around word, but not single instances of double quotes


I need to be able to remove a pair of double quotes around words, without removing single instances of double quotes.

I.e., in the examples below, the regex should only match "hello" and "bounce", and the replacement should strip the quotes while keeping the word itself.


3.5" hdd

"hello"

"cool

"bounce"

single sentence with out quotes.


The closest regex I've found so far is the one below, but it matches the entire "bounce" string, word included, which is not acceptable because I need to keep the word.

"([^\\"]|\\")*"

Other close regexes I've found in my research:

1.

\"*\"

but this also matches the single, unpaired quotes.

I also tried Unsuccessful Method 2.

This needs to be usable in C# code.

I've been using RegexStorm to test my regex: http://regexstorm.net/reference


Solution

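  • Your first regex is close, but its capturing group sits inside the repetition, so it holds a single character rather than the whole word. It needs an outer capturing group around the entire quoted content. It is also better to transform it into a linear regex that avoids alternation: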

    "([^\\"\r\n]*(?:\\.[^\\"\r\n]*)*)"
    

    I included \r and \n in the character classes so that a match cannot span more than one line; you may not need them. You then replace the whole match with $1 (a back-reference to the text saved by the first capturing group). To write a literal " inside a C# verbatim string (@"..."), double it: "".

    C# code:

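    using System;
    using System.Text.RegularExpressions;

    // Inside a verbatim string, "" stands for one literal double quote.
    string pattern = @"""([^\\""\r\n]*(?:\\.[^\\""\r\n]*)*)""";
    string input = @"3.5"" hdd
        ""hello""
        ""cool
        ""bounce""
        single sentence with out quotes.";

    // Replace each quoted match with group 1, i.e. the text between the quotes.
    Regex regex = new Regex(pattern);
    Console.WriteLine(regex.Replace(input, @"$1"));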
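
To see why the position of the capturing group matters, the sketch below runs both patterns over the sample lines from the question, one line at a time. The helper name StripPairedQuotes and the column layout are assumptions made for illustration, not part of the original answer, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// The asker's pattern: the group is repeated, so $1 holds only its last iteration.
string original = @"""([^\\""]|\\"")*""";
// The unrolled pattern from the answer: one group around the whole quoted content.
string unrolled = @"""([^\\""\r\n]*(?:\\.[^\\""\r\n]*)*)""";

// Hypothetical helper: replace every quoted match with the text of group 1.
string StripPairedQuotes(string text, string pattern) =>
    Regex.Replace(text, pattern, "$1");

string[] samples = { "3.5\" hdd", "\"hello\"", "\"cool", "\"bounce\"" };

foreach (string line in samples)
{
    // Columns: input | result with the asker's pattern | result with the unrolled pattern
    Console.WriteLine(
        $"{line,-10} | {StripPairedQuotes(line, original),-10} | {StripPairedQuotes(line, unrolled)}");
}
```

With the asker's pattern, "hello" collapses to o and "bounce" to e, because a repeated group in .NET substitutes only its last capture. The unrolled pattern keeps the whole word, and both patterns leave the lines with a single quote untouched.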