Does anyone have an example of a catastrophic backtracking in a regular expression, which works in Java 11? Most of the usual examples (like "(a+a+)+b") are fixed since java 9. The best would be one without back-references, no idea if it's possible in JDK 9+.
In our application we have a logic to control such backtracking, and to test that logic we used the expression "(x+x+)+y". After upgrading to JDK 11 it no longer causes the need behavior.
I am running java 14 and the backtracking issue still exists. It takes the following about 6 seconds to fail.
String regex = "(a+a+a+a+a+a+)+b";
String str = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac";
System.out.println(str.matches(regex));
This can be prevented by using independent group(s) (?>X)
. Once the group matches the first time, the engine checks to see if the following matches the b
. If it doesn't, no need to repeat as backtracking will not change that outcome. So it fails instantly (even for larger expressions and/or test strings).
String regex1 = "(?>(a+a+a+a+a+a+)+)b";
System.out.println(str.matches(regex1));