Tags: ruby, regex, negative-lookbehind, rubular

Regex lookahead/lookbehind comments


I have a snippet from a config file, and I need to match the contents of the quoted string, but only when it isn't commented out. Here's my current regex:

(?<!=#)test\.this\.regex\s+\"(.*?)\"

I feel like this should work? I read it like this:

(?<!=#) lookbehind to make sure it's not preceded by a #

test\.this\.regex\s+\"(.*?)\" matches test.this.regex "sup1"

Here is the config snippet

    test.this.regex "sup1" hi |sup1| # test.this.regex "sup3" hi |sup3|
# test.this.regex "sup2" do |sup2|
    test.this.regex "sup2" do |sup2|

But my regex matches all 4 times:

Match 1
1.  sup1
Match 2
1.  sup3
Match 3
1.  sup2
Match 4
1.  sup2

Solution

  • You can use this PCRE regex:

    /(?># *(*SKIP)(*FAIL)|(?:^|\s))test\.this\.regex\s+\"[^"]*\"/
    


    • (*FAIL) behaves like a failing negative assertion and is a synonym for (?!)
    • (*SKIP) defines a point beyond which the regex engine is not allowed to backtrack when the subpattern fails later
    • Together, (*SKIP)(*FAIL) provide a neat way around the restriction that a lookbehind cannot be variable-length, which a pattern like `# *` would need here.

    UPDATE: I'm not sure whether Ruby supports (*SKIP)(*FAIL), so here is an alternative version:

    (?:# *test\.this\.regex\s+\"[^"]*\"|\b(test\.this\.regex\s+\"[^"]*\"))
    

    Then keep only the matches where capture group #1 is non-empty.
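As a rough illustration of the group #1 check in Ruby, here is a minimal sketch run against the config snippet from the question. The names `PATTERN`, `CONFIG`, `live_matches` and `quoted_contents` are made up for this example, and the squiggly heredoc assumes Ruby 2.3 or newer. The second helper is an extra step that is not part of the answer above: it pulls out just the quoted contents that the question asks for.

```ruby
# Alternative pattern from the answer. Commented-out occurrences are consumed by
# the first branch, so only live occurrences populate capture group 1.
PATTERN = /(?:# *test\.this\.regex\s+\"[^"]*\"|\b(test\.this\.regex\s+\"[^"]*\"))/

# Config snippet from the question.
CONFIG = <<~'CONF'
  test.this.regex "sup1" hi |sup1| # test.this.regex "sup3" hi |sup3|
  # test.this.regex "sup2" do |sup2|
  test.this.regex "sup2" do |sup2|
CONF

# Hypothetical helper: String#scan yields one-element arrays for a pattern with
# one capture group, and the element is nil when the comment branch matched.
def live_matches(text)
  text.scan(PATTERN).flatten.compact
end

# Hypothetical helper: pull the text between the quotes out of each live match.
def quoted_contents(text)
  live_matches(text).map { |m| m[/"([^"]*)"/, 1] }
end

p live_matches(CONFIG)
p quoted_contents(CONFIG)  # => ["sup1", "sup2"]
```

The commented-out `"sup3"` and the first `"sup2"` still match the pattern as a whole, but group #1 is nil for them, so `compact` drops them.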
