I'd like to port a generic text processing tool, Texy!, from PHP to Java.
This tool uses ungreedy matching everywhere, via preg_match_all("/.../U").
So I am looking for a Java regex library that has an UNGREEDY flag.
I know I could use the .*? syntax, but there are a great many regular expressions I would have to rewrite, and I would have to re-check them with every updated version.
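For reference, here is a minimal sketch of the difference in plain java.util.regex, where laziness has to be spelled out per quantifier. The class name, the helper firstMatch and the sample input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {

    // Illustrative helper: returns the first match of regex in input, or null if there is none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<b>one</b> and <b>two</b>";

        // Greedy (Java's default): .* runs as far as it can, then backtracks to the last </b>.
        System.out.println(firstMatch("<b>.*</b>", input));

        // Lazy: .*? stops at the first </b>, which is what PHP's /U gives for a plain .*
        System.out.println(firstMatch("<b>.*?</b>", input));
    }
}
```

The first call matches the whole input, the second only "<b>one</b>".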
I've checked
Is there any such library?
Thanks, Ondra
I suggest you create your own modified Java library. Simply copy the java.util.regex source into your own package.
The Sun JDK 1.6 Pattern.java class defines these internal constants for the three quantifier types:
static final int GREEDY = 0;
static final int LAZY = 1;
static final int POSSESSIVE = 2;
You'll notice that these constants are used in only a few places, so the code would be trivial to modify. Take the following excerpt, which handles the * quantifier:
    case '*':
        ch = next();
        if (ch == '?') {
            next();
            return new Curly(prev, 0, MAX_REPS, LAZY);
        } else if (ch == '+') {
            next();
            return new Curly(prev, 0, MAX_REPS, POSSESSIVE);
        }
        return new Curly(prev, 0, MAX_REPS, GREEDY);
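Simply change the last line to use the LAZY flag instead of the GREEDY flag. Since you want a regex library that behaves like the PHP one, this might be the best way to go. Note that PCRE's /U modifier inverts greediness rather than just disabling it: a quantifier followed by ? becomes greedy again. To mirror that fully, you would also change LAZY to GREEDY in the ch == '?' branch, and the cases for the other quantifiers would need the same treatment.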
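If maintaining a patched copy of java.util.regex is not appealing, a different technique (not the one described above) is to rewrite each pattern string before handing it to the stock Pattern.compile. The following is only a sketch under some assumptions: UngreedyRewriter and invertGreediness are hypothetical names, the input is assumed to already be valid Java regex syntax (nested [...] classes are tracked the Java way, which differs from PCRE), and edge cases such as a ] placed first in a class or COMMENTS mode are not handled. It makes greedy quantifiers lazy and lazy ones greedy, and leaves possessive ones alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UngreedyRewriter {

    // Matches a counted quantifier such as {3}, {2,} or {2,5}.
    private static final Pattern COUNTED = Pattern.compile("\\{\\d+(,\\d*)?\\}");

    /** Hypothetical helper: greedy becomes lazy, lazy becomes greedy, possessive is unchanged. */
    public static String invertGreediness(String regex) {
        StringBuilder out = new StringBuilder();
        int n = regex.length();
        int classDepth = 0; // > 0 while inside [...], where * + ? { are literals
        int i = 0;
        while (i < n) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i = copyEscape(regex, i, out);
            } else if (c == '[') {
                classDepth++;
                out.append(c);
                i++;
            } else if (c == ']' && classDepth > 0) {
                classDepth--;
                out.append(c);
                i++;
            } else if (classDepth > 0) {
                out.append(c);
                i++;
            } else if (c == '(' && i + 1 < n && regex.charAt(i + 1) == '?') {
                out.append("(?"); // group syntax such as (?: or (?=, not a quantifier
                i += 2;
            } else if (c == '*' || c == '+' || c == '?') {
                out.append(c);
                i = flipSuffix(regex, i + 1, out);
            } else if (c == '{') {
                Matcher m = COUNTED.matcher(regex).region(i, n);
                if (m.lookingAt()) {
                    out.append(m.group());
                    i = flipSuffix(regex, m.end(), out);
                } else {
                    out.append(c); // not a counted quantifier, copy as-is
                    i++;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Copies an escape verbatim, including \Q...\E quotes and braced forms like \p{L} or \x{41}.
    private static int copyEscape(String regex, int i, StringBuilder out) {
        int n = regex.length();
        if (i + 1 >= n) {
            out.append('\\');
            return n;
        }
        char e = regex.charAt(i + 1);
        int end = i + 2;
        if (e == 'Q') {
            int close = regex.indexOf("\\E", end);
            end = (close < 0) ? n : close + 2;
        } else if ("pPxN".indexOf(e) >= 0 && end < n && regex.charAt(end) == '{') {
            int close = regex.indexOf('}', end);
            end = (close < 0) ? n : close + 1;
        }
        out.append(regex, i, end);
        return end;
    }

    // Called just after a quantifier was copied; i is the index of the character following it.
    private static int flipSuffix(String regex, int i, StringBuilder out) {
        if (i < regex.length() && regex.charAt(i) == '?') {
            return i + 1;            // lazy -> greedy: drop the '?'
        }
        if (i < regex.length() && regex.charAt(i) == '+') {
            out.append('+');         // possessive: keep unchanged
            return i + 1;
        }
        out.append('?');             // greedy -> lazy
        return i;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile(invertGreediness("<b>.*</b>"));
        Matcher m = p.matcher("<b>x</b><b>y</b>");
        if (m.find()) {
            System.out.println(m.group());
        }
    }
}
```

The trade-off is that the original pattern strings can stay untouched across Texy! updates, at the cost of a rewriter that has to understand enough regex syntax not to touch literals, character classes and group prefixes.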