php regex search preg-replace lookbehind

Variable length regex lookbehind


My regex is below:

(?<![\s]*?(\"|&quot;))WORD(?![\s]*?(\"|&quot;))

As you can see, I am trying to match all instances of WORD unless they are inside "quotation marks". So...

WORD <- Find this
"WORD" <- Don't find this
"   WORD   " <- Also don't find this, even though WORD is not touching the quotation marks
&quot;WORD&quot;  <- Don't find this (I check both &quot; and " so it still works after htmlspecialchars)

I believe my regex would work perfectly if I did not receive the error:

Compilation failed: lookbehind assertion is not fixed length

Is there any way to do what I intend, considering the limitations of lookbehind?

If you can think of any other way, let me know.

Many many thanks,

Matthew

P.S. The WORD section will actually contain John Gruber's URL detector.


Solution

  • I would suggest a different approach. It works as long as the quotes are correctly balanced: in that case WORD is inside a quoted string if and only if an odd number of quotes follows it, so a lookahead alone is enough and the lookbehind becomes unnecessary:

    if (preg_match(
    '/WORD             # Match WORD
    (?!                # unless it\'s possible to match the following here:
     (?:               # a string of characters
      (?!&quot;)       # that contains neither &quot;
      [^"]             # nor "
     )*                # (any length),
     ("|&quot;)        # followed by either " or &quot; (remember which in \1)
     (?:               # Then match
      (?:(?!\1).)*\1   # any string except our quote char(s), followed by that quote char(s)
      (?:(?!\1).)*\1   # twice,
     )*                # repeated any number of times --> even number
     (?:(?!\1).)*      # followed only by strings that don\'t contain our quote char(s)
     $                 # until the end of the string
    )                  # End of lookahead/sx', 
    $subject)) {
        // WORD occurs at least once outside quotation marks
    }
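
As a usage sketch, the same lookahead can be stored once and reused with preg_match_all and preg_replace, then checked against the four cases from the question. The constant UNQUOTED_WORD, the helper unquotedOffsets() and the sample strings are illustrative names that do not appear in the original answer, and WORD is still only a placeholder for the real sub-pattern. The sketch assumes balanced quotes, as stated above.

```php
<?php
// Illustrative sketch: UNQUOTED_WORD and unquotedOffsets() are hypothetical names.
// The pattern is the lookahead from the answer, with WORD as a placeholder.
const UNQUOTED_WORD = '/WORD
    (?!                     # fail if an odd number of quote marks follows:
      (?:(?!&quot;)[^"])*   # text containing neither " nor &quot;
      ("|&quot;)            # the next quote mark, remembered in \1
      (?:
        (?:(?!\1).)*\1      # text without that mark, then the mark ...
        (?:(?!\1).)*\1      # ... twice, so marks are consumed in pairs
      )*
      (?:(?!\1).)*          # then no further marks
      $                     # up to the end of the string
    )/sx';

/** Returns the byte offsets of every WORD that is not inside quotes. */
function unquotedOffsets(string $subject): array
{
    preg_match_all(UNQUOTED_WORD, $subject, $matches, PREG_OFFSET_CAPTURE);
    // With PREG_OFFSET_CAPTURE each match is a [text, offset] pair.
    return array_column($matches[0], 1);
}

print_r(unquotedOffsets('WORD "WORD" WORD'));   // offsets 0 and 12
echo preg_replace(UNQUOTED_WORD, '[$0]', 'WORD "WORD" WORD'), "\n";
// prints: [WORD] "WORD" [WORD]
```

Since the question is tagged preg-replace, the last call shows that the pattern can be passed to preg_replace unchanged. The capture group sits inside the negative lookahead, so only $0 is meaningful in the replacement string.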