regex, perl, escaping, quotes, substitution

Escape all double quotes inside a single quoted string with Regex


Possible Duplicate:
Regular Expression to escape double quotes inside single quotes

I need a regex (no other language; Perl or PCRE regex syntax would be best) that replaces every double quote " with \" wherever it appears inside a single-quoted string. This is an example string (part of a file):

var baseUrl = $("#baseurl").html();
var head = '<div id="finishingDiv" style="background-image:url({baseUrl}css/userAd/images/out_main.jpg); background-repeat: repeat-y; ">'+
'<div id="buttonbar" style="width:810px; text-align:right">';

(Be aware: the double quotes don't have to be paired like "someValueBetween", so a single-quoted string may contain an odd number of double quotes.)

This should be the end result for the last line above:

'<div id=\"buttonbar\" style=\"width:810px; text-align:right\">';

Thanks in advance

Update: To be clear, I want a regular expression only, not a Perl program. The regular expression can use Perl regex syntax or PHP PCRE syntax (which, as I understand it, is very close to Perl's). The goal is to run the regex in the search-and-replace dialogs of IDEs that support regexes (such as Eclipse and PhpEd).

In other words, I want a regex to put in the IDE's search field that matches exactly the unescaped " characters inside the single-quoted string. In Eclipse's replace field I can then just put \$1 to escape them.

They should work in RegexBuddy or Regex Coach, please, so I can test them.

At least that is the plan :)



Solution

  • You asked for Perl (or PCRE) and nothing else.

    Ok.

    If you just want to escape unescaped double quotes no matter where you find them, do this:

      s{
          (?<! (?<! \\ ) \\{1} )
          (?<! (?<! \\ ) \\{3} )
          (?<! (?<! \\ ) \\{5} )
          (?<! (?<! \\ ) \\{7} )
          (?= " )
      }{\\}xg;
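
    As a quick check of that substitution, here is a minimal sketch that wraps it in a sub and runs it on a short string. The sub name escape_unescaped_dquotes and the sample input are made up for illustration; the pattern is the one above. Note that the lookbehinds only recognize runs of 1, 3, 5, or 7 backslashes in front of a quote.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: returns a copy of the text with a backslash
    # inserted before every double quote that is not already escaped.
    sub escape_unescaped_dquotes {
        my ($text) = @_;
        $text =~ s{
            (?<! (?<! \\ ) \\{1} )
            (?<! (?<! \\ ) \\{3} )
            (?<! (?<! \\ ) \\{5} )
            (?<! (?<! \\ ) \\{7} )
            (?= " )
        }{\\}xg;
        return $text;
    }

    # The first pair of quotes is bare, the second pair is already escaped.
    print escape_unescaped_dquotes(q{id="a" title=\"b\"}), "\n";
    # prints: id=\"a\" title=\"b\"
    ```

    The already-escaped quotes are left alone, so running the substitution twice should not double up the backslashes.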
    

    If you want to escape unescaped double quotes between unescaped single quotes, and you only have one pair of such single quotes, do this:

    1 while s{
    
      (?(DEFINE)
    
        (?<unescaped>
          (?<! (?<! \\ ) \\{1} )
          (?<! (?<! \\ ) \\{3} )
          (?<! (?<! \\ ) \\{5} )
          (?<! (?<! \\ ) \\{7} )
        )
    
        (?<single_quote> (?&unescaped) ' )
        (?<double_quote> (?&unescaped) " )
        (?<unquoted>     [^'] *?          )
    
      )
    
      (?<HEAD>
        (?&single_quote)
        (?&unquoted)
      )
    
      (?<TAIL>
        (?&double_quote)
        (?&unquoted)
        (?&single_quote)
    
      )
    
    }<$+{HEAD}\\$+{TAIL}>xg;
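
    To see the loop at work, here is a small sketch that applies it to the last line of the sample from the question. The sub name escape_one_pair is hypothetical; the pattern and replacement are the ones above. Each pass escapes the first still-unescaped double quote between the single quotes, and the loop stops once a pass finds nothing left to match.

    ```perl
    use strict;
    use warnings;

    # Hypothetical wrapper around the "1 while s{...}<...>" loop above.
    # Assumes the line holds exactly one pair of unescaped single quotes.
    sub escape_one_pair {
        my ($line) = @_;
        1 while $line =~ s{
          (?(DEFINE)
            (?<unescaped>
              (?<! (?<! \\ ) \\{1} )
              (?<! (?<! \\ ) \\{3} )
              (?<! (?<! \\ ) \\{5} )
              (?<! (?<! \\ ) \\{7} )
            )
            (?<single_quote> (?&unescaped) ' )
            (?<double_quote> (?&unescaped) " )
            (?<unquoted>     [^'] *?          )
          )
          (?<HEAD> (?&single_quote) (?&unquoted) )
          (?<TAIL> (?&double_quote) (?&unquoted) (?&single_quote) )
        }<$+{HEAD}\\$+{TAIL}>xg;
        return $line;
    }

    my $sample = q{'<div id="buttonbar" style="width:810px; text-align:right">';};
    print escape_one_pair($sample), "\n";
    # prints: '<div id=\"buttonbar\" style=\"width:810px; text-align:right\">';
    ```

    This relies on Perl letting the regex engine backtrack into a recursed group such as (?&unquoted); engines that treat recursion as atomic may behave differently.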
    

    But if you may have multiple sets of paired unescaped single quotes per line, and you only want to escape the unescaped double quotes that fall between those unescaped single quotes, then do this:

    sub escape_quote {
      local $_ = shift;  # lexical "my $_" was experimental and is gone as of Perl 5.24
      s{
          (?<! (?<! \\ ) \\{1} )
          (?<! (?<! \\ ) \\{3} )
          (?<! (?<! \\ ) \\{5} )
          (?<! (?<! \\ ) \\{7} )
          (?= " )
      }{\\}xg;
    
      return $_;
    }
    
    s{
    
      (?(DEFINE)
    
        (?<unescaped>
          (?<! (?<! \\ ) \\{1} )
          (?<! (?<! \\ ) \\{3} )
          (?<! (?<! \\ ) \\{5} )
          (?<! (?<! \\ ) \\{7} )
        )
    
        (?<single_quote> (?&unescaped) ' )
        (?<unquoted>     [^'] *?          )
    
      )
    
      (?<HEAD> (?&single_quote) )
      (?<TARGET> (?&unquoted) )
      (?<TAIL> (?&single_quote) )
    
    }{
                   $+{HEAD}    .
      escape_quote($+{TARGET}) .
                   $+{TAIL}
    
    }xeg;
    

    Note that this all presupposes you have no legitimate paired unescaped double quotes containing unescaped single quotes. Even something like this will throw you off:

    my $cute = q(') . "stuff" . q(');
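
    Here is a rough sketch of the multi-pair version wrapped in a sub, run first on a line with two single-quoted strings and then on the troublesome line above. The sub name escape_in_single_quotes and the first sample line are invented for illustration, and escape_quote copies its argument into a plain lexical rather than touching $_.

    ```perl
    use strict;
    use warnings;

    # Escape every unescaped double quote in the given text.
    sub escape_quote {
        my ($text) = @_;
        $text =~ s{
            (?<! (?<! \\ ) \\{1} )
            (?<! (?<! \\ ) \\{3} )
            (?<! (?<! \\ ) \\{5} )
            (?<! (?<! \\ ) \\{7} )
            (?= " )
        }{\\}xg;
        return $text;
    }

    # Hypothetical wrapper: escape double quotes only between paired,
    # unescaped single quotes, pairing them left to right.
    sub escape_in_single_quotes {
        my ($line) = @_;
        $line =~ s{
          (?(DEFINE)
            (?<unescaped>
              (?<! (?<! \\ ) \\{1} )
              (?<! (?<! \\ ) \\{3} )
              (?<! (?<! \\ ) \\{5} )
              (?<! (?<! \\ ) \\{7} )
            )
            (?<single_quote> (?&unescaped) ' )
            (?<unquoted>     [^'] *?          )
          )
          (?<HEAD>   (?&single_quote) )
          (?<TARGET> (?&unquoted)     )
          (?<TAIL>   (?&single_quote) )
        }{
          $+{HEAD} . escape_quote($+{TARGET}) . $+{TAIL}
        }xeg;
        return $line;
    }

    # Two single-quoted strings with a double-quoted one in between.
    print escape_in_single_quotes(q{var a = 'x="1"' + "y" + 'z="2"';}), "\n";
    # prints: var a = 'x=\"1\"' + "y" + 'z=\"2\"';

    # The caveat: the single quotes inside q(') get paired with each other.
    print escape_in_single_quotes(q{my $cute = q(') . "stuff" . q(');}), "\n";
    # prints: my $cute = q(') . \"stuff\" . q(');
    ```

    In the second line the substitution has no way of knowing that each single quote is a one-character string of its own, so it escapes the quotes around "stuff" even though they are real string delimiters.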
    

    Probably, though, you want to use a proper parsing module.

    Please pay no attention to all the garish and deceitfully incorrect SO coloring. For some reason, it doesn't seem to be able to parse Perl as well as perl does. Can't imagine why. ☺