According to this question, there is a big difference between find() and matches(), yet both provide results in some form.
As a kind of utility, the toMatchResult() function returns the current result of the matches() operation. I hope my assumption under (1) is valid. (regex is here)
String line = "aabaaabaaabaaaaaab";
String regex = "(a*b)a{3}";
Matcher matcher = Pattern.compile(regex).matcher(line);
matcher.find();
// matcher.matches();(1) --> returns false because the regex doesn't match the whole string
String expectingAab = matcher.group(1);
System.out.println("actually: " + expectingAab);
Unfortunately, the following does not work at all (exception: No match found):
String line = "aabaaabaaabaaaaaab";
String regex = "(a*b)a{3}";
String expectingAab = Pattern.compile(regex).matcher(line).toMatchResult().group(1);
System.out.println("actually: " + expectingAab);
Why is that? My first assumption was that it doesn't work because the regex should match the whole string, but the same exception is thrown with the string value aabaaa as well...
Of course the matcher needs to be put into the correct state with find(), but what if I'd like to use a one-liner for it? I actually implemented a utility class for this:
protected static class FindResult {
    private final Matcher innerMatcher;

    public FindResult(Matcher matcher) {
        innerMatcher = matcher;
        innerMatcher.find();
    }

    public Matcher toFindResult() {
        return innerMatcher;
    }
}

public static void main(String[] args) {
    String line = "aabaaabaaabaaaaaab";
    String regex = "(a*b)a{3}";
    String expectingAab = new FindResult(Pattern.compile(regex).matcher(line)).toFindResult().group(1);
    System.out.println("actually: " + expectingAab);
}
I know full well that this is not an optimal way to create a one-liner, especially because it puts a heavy load on the garbage collector.
Is there an easier, better solution for this?
It's worth noting that I'm looking for a Java 8 solution. The matching logic works differently in Java 9 and above.
The toMatchResult() method returns the state of the previous match operation, whether it was find(), lookingAt(), or matches().
Your line
String expectingAab = Pattern.compile(regex).matcher(line).toMatchResult().group(1);
does not invoke any of those methods; hence, it never has a previous match and always produces an IllegalStateException: No match found.
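To make the difference concrete, here is a minimal sketch contrasting the two situations: a snapshot taken after find() and a snapshot taken from a fresh matcher. The class and method names (MatchStateDemo, firstGroupAfterFind, throwsWithoutMatchOperation) are made up for illustration and are not part of the question's code.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class MatchStateDemo {

    // Runs find() first, then takes the snapshot; returns null if nothing matched.
    static String firstGroupAfterFind(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.find()) {
            return null;
        }
        MatchResult snapshot = matcher.toMatchResult();
        return snapshot.group(1);
    }

    // Takes the snapshot from a fresh matcher, without any match operation,
    // and reports whether group(1) throws IllegalStateException.
    static boolean throwsWithoutMatchOperation(String regex, String input) {
        try {
            Pattern.compile(regex).matcher(input).toMatchResult().group(1);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String line = "aabaaabaaabaaaaaab";
        String regex = "(a*b)a{3}";
        System.out.println("after find(): " + firstGroupAfterFind(regex, line));
        System.out.println("fresh matcher throws: " + throwsWithoutMatchOperation(regex, line));
    }
}
```

The second method also throws for the input aabaaa, which the regex would match entirely: the exception depends on the missing match operation, not on the input.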
If you want a one-liner to extract the first group of the first match, you could simply use
String expectingAab = line.replaceFirst(".*?(a*b)a{3}.*", "$1");
The pattern needs .*? before and .* after the actual match pattern to consume the rest of the string, so that only the first group remains as the result. The caveat is that if no match exists, it will evaluate to the original string.
If you want matches() rather than find() semantics, you can use
String expectingNoMatch = line.replaceFirst("^(a*b)a{3}$", "$1");
which will evaluate to the original string with the example input, as it doesn’t match.
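As a rough illustration of both replaceFirst variants and of the no-match caveat, the following sketch wraps them in two helper methods. The class and method names (ReplaceFirstDemo, findStyle, matchesStyle) are hypothetical; the patterns are the ones from the text above.

```java
// Hypothetical demo class; the names are illustrative only.
final class ReplaceFirstDemo {

    // find() semantics: the match may sit anywhere in the input.
    static String findStyle(String input) {
        return input.replaceFirst(".*?(a*b)a{3}.*", "$1");
    }

    // matches() semantics: the anchors require the pattern to span the input.
    static String matchesStyle(String input) {
        return input.replaceFirst("^(a*b)a{3}$", "$1");
    }

    public static void main(String[] args) {
        String line = "aabaaabaaabaaaaaab";
        System.out.println(findStyle(line));        // aab
        System.out.println(matchesStyle(line));     // unchanged input, no match
        System.out.println(matchesStyle("aabaaa")); // aab
        System.out.println(findStyle("xyz"));       // unchanged input, no match
    }
}
```

Because a failed match returns the input unchanged, a caller cannot tell "no match" apart from a match whose first group happens to equal the whole input; if that distinction matters, an explicit Matcher is the safer choice.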
If you want your utility method not to create a FindResult instance, just use a straightforward static method.
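One possible shape for such a static method is sketched below. The names RegexUtil and find are assumptions made for this example; the method simply runs find() and hands back the same Matcher, so the call site stays a one-liner.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical utility class; the names are illustrative only.
final class RegexUtil {

    private RegexUtil() {
        // static helpers only, no instances
    }

    // Compiles the regex, runs find() once and returns the matcher.
    // If nothing matched, a later group() call throws IllegalStateException,
    // exactly as it would on a matcher used directly.
    static Matcher find(String regex, CharSequence input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        matcher.find();
        return matcher;
    }

    public static void main(String[] args) {
        String line = "aabaaabaaabaaaaaab";
        String expectingAab = RegexUtil.find("(a*b)a{3}", line).group(1);
        System.out.println("actually: " + expectingAab);
    }
}
```

This keeps the behavior of the FindResult wrapper, including the fact that the boolean result of find() is ignored, but without the extra wrapper object.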
However, this is a typical case of premature optimization. The Pattern.compile invocation creates a Pattern object plus a bunch of internal node objects representing the pattern elements; the matcher invocation creates a Matcher instance plus arrays to hold the groups; the toMatchResult invocation creates another object instance; and, of course, the group(1) invocation unavoidably creates a new string instance representing the result.
The creation of the FindResult instance is the cheapest step in this sequence. If you care about performance, keep the result of Pattern.compile when you use the pattern more than once, as that's the most expensive operation and the Pattern instance is immutable and shareable, as explicitly stated in its documentation.
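Keeping the compiled pattern could look like the following minimal sketch, which stores it in a static final field and creates only a new Matcher per call. The class, field, and method names (CachedPatternDemo, GROUP_THEN_AAA, firstGroup) are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class CachedPatternDemo {

    // Compiled once; a Pattern is immutable and can be shared freely.
    private static final Pattern GROUP_THEN_AAA = Pattern.compile("(a*b)a{3}");

    // A Matcher is cheap but stateful, so a new one is created per call.
    // Returns group 1 of the first match, or null if there is none.
    static String firstGroup(CharSequence input) {
        Matcher matcher = GROUP_THEN_AAA.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstGroup("aabaaabaaabaaaaaab"));
        System.out.println(firstGroup("baaa"));
        System.out.println(firstGroup("xyz"));
    }
}
```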
Of course, the string methods replaceFirst and replaceAll do no magic, but perform the same steps under the hood.