java regex regular-language

Why does this string match even though parts of it are in a negative lookahead?


Why does this regex

(?!(public|private|protected|abstract|final|static))(.*)\(\);

match this string:

public Test();

even though I said that I want an empty space, then none of (public|...), then some stuff, and then ();? What am I doing wrong here?

But I want to match this: Test();

Edit:

Why does this not work (no matches found), even though regex101.com matches Test();?

final String test = "public class Test {\n" +
        "  Test();\n";

final Pattern ptrn = Pattern.compile("^  (?!(public|private|protected|abstract|final|static))(.*)\\(\\);");
final Matcher mtchr = ptrn.matcher(test);

while(mtchr.find())
    System.out.println("FOUND");

Solution

  • The Problem

    The snippet below reproduces the issue. The regex matches because it is not anchored to a position in the string, so the engine attempts a match at every position in turn.

    The engine first attempts a match at the p in public Test();. The negative lookahead sees public at that position, so the attempt fails and the engine moves on to the next position: the u in ublic Test();. None of the listed keywords starts at this position, so the lookahead passes, (.*) consumes ublic Test, and the regex matches ublic Test();.

    var s = 'public Test();'
    var r = /(?!(public|private|protected|abstract|final|static))(.*)\(\);/

    // The match starts at index 1, so this logs: ublic Test();
    console.log(s.match(r)[0])
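
Since the question is about Java, the same behavior can be sketched with java.util.regex. This is a minimal illustration, not the asker's code: the class name UnanchoredLookahead and the helper firstMatch are made up for the example. It reports the index where find() finally succeeds, which shows the engine skipping position 0 and matching from position 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnanchoredLookahead {
    // The unanchored pattern from the question.
    static final Pattern UNANCHORED = Pattern.compile(
            "(?!(public|private|protected|abstract|final|static))(.*)\\(\\);");

    // Hypothetical helper: returns "start:matched text" for the first match,
    // or null when find() finds nothing.
    static String firstMatch(String input) {
        Matcher m = UNANCHORED.matcher(input);
        return m.find() ? m.start() + ":" + m.group() : null;
    }

    public static void main(String[] args) {
        // The lookahead rejects index 0, so find() retries at index 1 and succeeds.
        System.out.println(firstMatch("public Test();")); // prints 1:ublic Test();
    }
}
```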

    The Fix

    So how do we fix this? Simple: anchor the pattern. The ^ anchor forces the regex to match at the start of the string, so the engine only attempts a match at the p in public Test(); and never retries at the u or at any later position.

    The anchored regex:

    ^(?!(public|private|protected|abstract|final|static))(.*)\(\);
    

    var a = [
      'public Test();',
      'Test();'
    ]
    var r = /^(?!(public|private|protected|abstract|final|static))(.*)\(\);/

    // Logs "*** no match" for 'public Test();' and "Test();" for 'Test();'
    a.forEach(function(s) {
      console.log(r.test(s) ? s.match(r)[0] : '*** no match')
    })
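
The Java snippet in the question's Edit has a second wrinkle. By default, ^ in java.util.regex matches only at the very start of the input, and that input starts with public class Test {, so the anchored pattern never gets a chance to match on the second line. Assuming the goal is to match indented lines inside a multi-line string, compiling with Pattern.MULTILINE makes ^ match at the start of every line. The sketch below uses illustrative names (AnchoredLookahead, findAll). That regex101 matched because it had its multiline flag enabled is an assumption, not something the question confirms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoredLookahead {
    // MULTILINE lets ^ match at the start of each line, not only at the start of the input.
    static final Pattern ANCHORED = Pattern.compile(
            "^  (?!(public|private|protected|abstract|final|static))(.*)\\(\\);",
            Pattern.MULTILINE);

    // Hypothetical helper: collects the full text of every match.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = ANCHORED.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        final String test = "public class Test {\n" +
                "  Test();\n";
        // Only the indented line without a modifier matches.
        System.out.println(findAll(test)); // prints [  Test();]
    }
}
```

Because ^ pins each attempt to the start of a line, a line such as "  public Test();" is rejected outright instead of matching from a later position.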