How can regex ignore escaped-quotes when matching strings?


I'm trying to write a regex that matches a quoted string, where an apostrophe ends the string only if it has not been escaped. Consider the following:

<?php $s = 'Hi everyone, we\'re ready now.'; ?>

My goal is to write a regular expression that will essentially match the string portion of that. I'm thinking of something such as

/.*'([^']).*/

in order to match a simple string, but I've been trying to figure out how to get a negative lookbehind working on that apostrophe to ensure that it is not preceded by a backslash...

Any ideas?

- JMT


Solution

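    <?php
    // A single backslash, kept in a variable so the heredoc below needs no extra escaping.
    $backslash = '\\';
    
    // Expands to #(["'])(?:\\?+.)*?\1# : an opening quote, then lazily any
    // (optionally backslash-prefixed) character, up to the same closing quote.
    $pattern = <<< PATTERN
    #(["'])(?:{$backslash}{$backslash}?+.)*?{$backslash}1#
    PATTERN;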
    
    foreach(array(
        "<?php \$s = 'Hi everyone, we\\'re ready now.'; ?>",
        '<?php $s = "Hi everyone, we\\"re ready now."; ?>',
        "xyz'a\\'bc\\d'123",
        "x = 'My string ends with a backslash\\\\';"
        ) as $subject) {
            preg_match($pattern, $subject, $matches);
            echo $subject , ' => ', $matches[0], "\n\n";
    }
    

    prints

    <?php $s = 'Hi everyone, we\'re ready now.'; ?> => 'Hi everyone, we\'re ready now.'
    
    <?php $s = "Hi everyone, we\"re ready now."; ?> => "Hi everyone, we\"re ready now."
    
    xyz'a\'bc\d'123 => 'a\'bc\d'
    
    x = 'My string ends with a backslash\\'; => 'My string ends with a backslash\\'
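
To make the moving parts easier to see, here is a minimal sketch of the same regex spelled out with the `x` (extended) flag, next to the negative-lookbehind idea from the question. It is an illustration under a few assumptions, not a definitive implementation: the `~` delimiter replaces `#` only because `#` starts a comment in extended mode, and `find_quoted_strings`, `$lookbehind`, and the sample subject are hypothetical names and data introduced here.

```php
<?php
// Same regex as in the answer, written as a nowdoc so backslashes need no
// PHP-level escaping. The x flag permits whitespace and comments in the pattern.
$pattern = <<<'REGEX'
~
  (["'])      # group 1: the opening quote, single or double
  (?:
    \\?+      # an optional backslash, matched possessively ...
    .         # ... then any one character, so \' and \\ are consumed as pairs
  )*?         # repeated as few times as possible
  \1          # the same quote character that opened the string
~x
REGEX;

// The lookbehind idea from the question: stop at a quote that is not
// directly preceded by a backslash.
$lookbehind = '~([\'"]).*?(?<!\\\\)\1~';

// Hypothetical helper: return every quoted literal found in $subject.
function find_quoted_strings(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $matches);
    return $matches[0];
}

// Sample input (an assumption for illustration): three quoted literals,
// the last of which ends with an escaped backslash.
$subject = <<<'TEXT'
$a = 'it\'s'; $b = "say \"hi\""; $c = 'tail\\';
TEXT;

$robust = find_quoted_strings($pattern, $subject);
$naive  = find_quoted_strings($lookbehind, $subject);

echo implode("\n", $robust), "\n";
// 'it\'s'
// "say \"hi\""
// 'tail\\'

echo count($naive), "\n";
// 2
```

The lookbehind variant misses `'tail\\'` because it only checks the single character before the closing quote and cannot tell that this backslash is itself escaped. Consuming a backslash together with whatever follows it avoids that ambiguity.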