php regex concatenation pcre lookbehind

Is it possible to AND two separate lookaround/zero-width assertions (i.e. lookbehind/look-behind) in a regular expression?


I'm using Perl for this regular expression question, but it would be good to know if it applies to PHP, as well.

I need to comment out all print statements or all things that start with print in a PHP file. It looks something like this:

<?php
    // Description of file
    ...
    print("Foobar");
    // print("Foo");
    //print("bar");
    // Open and print file
    function printtemplate($file) {
    ...
    }
    ...
    printtemplate($file);
    ...  
?>

To start with, I formulated a regular expression like this:

((?<!function )|(?<!//))print

It obviously does not work because the | is an OR. I'm looking for an AND so that both negative look-behind assertions need to be true. Does the AND construct exist in some form in regular expressions or is there a way to simulate one?

Ultimately, I want the php file to look like the following, after the regular expression is applied:

<?php
    // Description of file
    ...
    //print("Foobar");
    // print("Foo");
    //print("bar");
    // Open and print file
    function printtemplate($file) {
    ...
    }
    ...
    //printtemplate($file);
    ...  
?>

Any help would be appreciated. Thank you.


Solution

  • Just put them next to each other. That's it. Consecutive look-arounds act as an AND, since every one of them must succeed at the current position before the regex can match anything that comes after them.

    In your case, it would be:

    (?<!function )(?<!//)print
    

    However, note that the regex above produces false positives: it still matches print in lines such as // print("Foo"); (where a space follows the //) and // Open and print file, so more comments are added than necessary.

    For PCRE (used in PHP), a look-behind assertion requires the pattern to be strictly fixed-length. Because the // can be followed by any amount of whitespace or other text before print, a look-behind cannot check in all cases whether print is already commented out. @mpapec's answer gives one solution that is applicable to well-written code and has better coverage than your regex with look-behind.
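
The behaviour described in this answer can be checked with a short sketch. It is written in Python rather than Perl or PHP, on the assumption that Python's `re` module treats these fixed-width look-behinds the same way for this pattern. The names `BOTH`, `EITHER`, `SAMPLE` and `matching_lines` are illustrative and not part of the original post. The sample text is a trimmed copy of the file in the question.

```python
import re

# Two negative look-behinds side by side: both must succeed at the same
# position before `print` is tried, which gives the AND effect.
BOTH = re.compile(r"(?<!function )(?<!//)print")

# The alternation from the question: at any position at least one of the
# two look-behinds succeeds, so this behaves like a bare `print`.
EITHER = re.compile(r"((?<!function )|(?<!//))print")

# Trimmed copy of the PHP lines from the question (illustrative data).
SAMPLE = """\
    print("Foobar");
    // print("Foo");
    //print("bar");
    // Open and print file
    function printtemplate($file) {
    printtemplate($file);
"""


def matching_lines(pattern, text):
    """Return the stripped lines in which `pattern` finds a match."""
    return [line.strip() for line in text.splitlines() if pattern.search(line)]


both_hits = matching_lines(BOTH, SAMPLE)
either_hits = matching_lines(EITHER, SAMPLE)

print(both_hits)
# ['print("Foobar");', '// print("Foo");', '// Open and print file', 'printtemplate($file);']
print(len(either_hits))
# 6
```

The adjacent look-behinds skip `//print("bar");` and the `function printtemplate` line, as intended. They still match `// print("Foo");` and `// Open and print file`, which are the false positives mentioned above: a fixed-length look-behind for `//` only inspects the two characters immediately before `print`. The alternation version matches all six lines.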