Java regex lookbehind


I want to match a string that contains "json" at least twice, with no "from" between two of the "json" occurrences.

For example (with whether I want each string to match):
select json,json from XXX -> Yes
select json from json XXXX -> No
select json,XXXX,json from json XXX -> Yes

The third one should match because I only need two "json" occurrences with no "from" between them. After learning about regex lookbehind, I wrote the regex like this:

select.*json.*?(?<!from)json.*from.*

I'm using the lookbehind to exclude the string "from".

But after testing, I found that this regex also matches the string "select get_json_object from get_json_object".

What is wrong with my regex? Any suggestions are appreciated.


Solution

  • You need a tempered greedy token to achieve this. Use this regex:

    \bjson\b(?:(?!\bfrom\b).)+\bjson\b
    

    This expression (?:(?!\bfrom\b).)+ will match any text that does not contain from as a whole word inside it.
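
    To see which part of a string the expression actually picks out, here is a minimal sketch using Pattern and Matcher.find(). The class name TemperedTokenDemo and the helper firstMatch are made up for this illustration. The last input also shows the effect of \b: the json inside get_json_object is surrounded by underscores, which count as word characters, so it is not treated as a whole word.

    ```java
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class TemperedTokenDemo {
        // Two whole-word "json" tokens with no whole-word "from" between them.
        static final Pattern JSON_PAIR =
                Pattern.compile("\\bjson\\b(?:(?!\\bfrom\\b).)+\\bjson\\b");

        // Hypothetical helper: returns the first matched span, or null if there is none.
        static String firstMatch(String input) {
            Matcher m = JSON_PAIR.matcher(input);
            return m.find() ? m.group() : null;
        }

        public static void main(String[] args) {
            System.out.println(firstMatch("select json,json from XXX"));   // json,json
            System.out.println(firstMatch("select json from json XXXX"));  // null
            // "json" is embedded in an identifier here, so \bjson\b never matches.
            System.out.println(firstMatch("select get_json_object,get_json_object from t")); // null
        }
    }
    ```

    If you do want json inside identifiers such as get_json_object to count, you would have to drop the \b around json, which is a different requirement from the whole-word match used here.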


    To match the whole line, you can use:

    ^.*\bjson\b(?:(?!\bfrom\b).)+\bjson\b.*$
    

    As you asked in your post, this regex matches the line as long as it contains two json tokens with no from between them.
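
    If the input holds several lines, one way to apply the anchored version is sketched below. It assumes Pattern.MULTILINE so that ^ and $ match at each line break; the class name LineMatchDemo and the helper matchingLines are made up for this illustration.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class LineMatchDemo {
        // MULTILINE lets ^ and $ match at line breaks; "." does not match a
        // line terminator by default, so a match never spans two lines.
        static final Pattern LINE = Pattern.compile(
                "^.*\\bjson\\b(?:(?!\\bfrom\\b).)+\\bjson\\b.*$", Pattern.MULTILINE);

        // Hypothetical helper: collects every line that the pattern accepts.
        static List<String> matchingLines(String text) {
            List<String> result = new ArrayList<>();
            Matcher m = LINE.matcher(text);
            while (m.find()) {
                result.add(m.group());
            }
            return result;
        }

        public static void main(String[] args) {
            String text = "select json,json from XXX\n"
                    + "select json from json XXXX\n"
                    + "select json from json json XXXX";
            // Prints the first and third lines; the second has "from" between its only two json tokens.
            matchingLines(text).forEach(System.out::println);
        }
    }
    ```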


    Edit: Why OP's regex select.*json.*?(?<!from)json.*from.* didn't work as expected

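    Your regex starts by matching select, then .* matches as much as possible while still leaving a json ahead. After that, .*? matches some optional characters, the pattern expects a second json, .* matches some more characters, the pattern expects a from, and the final .* matches zero or more remaining characters. Note that (?<!from) only checks that the characters immediately before the second json are not from; it does not check whether from appears anywhere between the two json strings.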

    Let's take an example string that should match.

    select json from json json XXXX
    

    It has two json strings with no from between them, so it should match, but it doesn't. Your regex fixes the order of the tokens as json, then json again, then from, and this string doesn't follow that order.
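
    To make this concrete, here is a small sketch that runs the regex from the question through String.matches(), which requires the whole string to match. The class name LookbehindDemo and the extra test strings are made up for this illustration.

    ```java
    class LookbehindDemo {
        // The regex from the question, unchanged.
        static final String OP_REGEX = "select.*json.*?(?<!from)json.*from.*";

        public static void main(String[] args) {
            // The lookbehind only inspects the characters directly before "json".
            System.out.println("from json".matches(".*(?<!from)json")); // true: "rom " precedes json
            System.out.println("fromjson".matches(".*(?<!from)json"));  // false: "from" directly precedes json

            // Tokens in the order json, json, from: accepted.
            System.out.println("select json,json from XXX".matches(OP_REGEX));       // true
            // No "from" after the second json, so the fixed order is not met.
            System.out.println("select json from json json XXXX".matches(OP_REGEX)); // false
            // A "from" between the two json strings is not rejected by the lookbehind.
            System.out.println("select json from json from t".matches(OP_REGEX));    // true
        }
    }
    ```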

    Here is a Java code demo

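    // Needs: import java.util.Arrays; import java.util.List;
    List<String> list = Arrays.asList(
        "select json,json from XXX",
        "select json from json XXXX",
        "select json,json from json XXX",
        "select json from json json XXXX");

    // String.matches() must match the whole input, hence the leading and trailing .*
    list.forEach(x -> System.out.println(
        x + " --> " + x.matches(".*\\bjson\\b(?:(?!\\bfrom\\b).)+\\bjson\\b.*")));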
    

    Prints,

    select json,json from XXX --> true
    select json from json XXXX --> false
    select json,json from json XXX --> true
    select json from json json XXXX --> true