ruby regex lookbehind

Regex negative lookbehinds with a wildcard


I'm trying to match some text if it does not have another block of text in its vicinity. For example, I would like to match "bar" if "foo" does not precede it. I can match "bar" if "foo" does not immediately precede it using a negative lookbehind in this regex:

/(?<!foo)bar/

but I would also like to not match "foo 12345 bar". I tried:

/(?<!foo.{1,10})bar/

but using a wildcard + a range appears to be an invalid regex in Ruby. Am I thinking about the problem wrong?


Solution

  • You are thinking about it the right way. Unfortunately, lookbehinds usually have to be of fixed length. The only major exception to that is .NET's regex engine, which allows repetition quantifiers inside lookbehinds. But since you only need a negative lookbehind and not a lookahead as well, there is a hack for you: reverse the string, then try to match:

    /rab(?!.{0,10}oof)/
    

    Then reverse the result of the match or, if the position is what you are after, subtract the end offset of the match from the string's length to get where "bar" starts in the original string.
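A minimal Ruby sketch of the reversal trick described above. The method name `bar_positions_without_foo` is made up for illustration, and the literal `3` is simply the length of "bar"; a more complex pattern would need its own length handling.

```ruby
# Return the start offsets (in the original string) of every "bar" that is
# not preceded by "foo" with at most 10 characters in between.
# Hypothetical helper, not part of any library.
def bar_positions_without_foo(text)
  reversed = text.reverse
  positions = []
  # On the reversed string the lookbehind becomes a lookahead,
  # which may contain a variable-length quantifier.
  reversed.scan(/rab(?!.{0,10}oof)/) do
    match_start = Regexp.last_match.begin(0) # offset in the reversed string
    # The end of "rab" in the reversed string is the start of "bar"
    # in the original: length - (match_start + 3).
    positions << text.length - match_start - 3
  end
  positions.sort
end
```

For instance, `bar_positions_without_foo("foo 12345 bar")` finds nothing, while a string with no "foo" near "bar" yields the offset of "bar". Note that `.` does not match newlines here unless the `m` flag is added.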

    Now from the regex you have given, I suppose that this was only a simplified version of what you actually need. Of course, if bar is a complex pattern itself, some more thought needs to go into how to reverse it correctly.

    Note that if your pattern required both variable-length lookbehinds and lookaheads, you would have a harder time solving this. Also, in your case, it would be possible to deconstruct your lookbehind into multiple fixed-length ones, one per possible gap length (this works because you use neither + nor *, so the number of lengths is finite):

    /(?<!foo)(?<!foo.)(?<!foo.{2})(?<!foo.{3})(?<!foo.{4})(?<!foo.{5})(?<!foo.{6})(?<!foo.{7})(?<!foo.{8})(?<!foo.{9})(?<!foo.{10})bar/
    

    But that's not all that nice, is it?
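If the chained version is still the preferred route, the pattern does not have to be typed by hand. The following is a sketch that builds it in Ruby; the constant names `MAX_GAP` and `NO_FOO_BEFORE_BAR` are invented for this example, and it assumes a Ruby version whose regex engine accepts fixed-length lookbehinds such as `(?<!foo.{2})`.

```ruby
# Largest number of characters allowed between "foo" and "bar"
# for the match to be rejected (assumed value, taken from the question).
MAX_GAP = 10

# One fixed-length negative lookbehind per possible gap length 0..MAX_GAP.
lookbehinds = (0..MAX_GAP).map do |n|
  n.zero? ? "(?<!foo)" : "(?<!foo.{#{n}})"
end.join

# Equivalent to the long hand-written regex above.
NO_FOO_BEFORE_BAR = Regexp.new(lookbehinds + "bar")
```

With this, `NO_FOO_BEFORE_BAR.match?("foo 12345 bar")` is false and `NO_FOO_BEFORE_BAR.match?("bar")` is true, and the match offset refers directly to the original string, so no reversal or position arithmetic is needed.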