I'm looking to extract text (using Java's built-in regex at the moment) that follows one of a range of prefixes. I'm using the lookbehind technique, but the result I get always seems to be the longest match rather than the match for the first alternation group that matches the prefix text.
That is,
(?<=Book name|Book).*
Given the text "Book name Story"
The match is always "name Story"
regardless of which way round the regex alternation is.
My question is: what is the best way to get just the "Story" text without matching any of the other text?
In practice I'm hoping to limit the right-hand side too with a lookahead (just in case that's pertinent).
You could use a lookahead here.
(?<=Book name |Book )\S+(?=$)
OR
(?<=Book name )\S+|(?<=Book )(?!name)\S+
As Java string literals, these would be:
"(?<=Book name |Book )\\S+(?=$)"
OR
"(?<=Book name )\\S+|(?<=Book )(?!name)\\S+"
Code:
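import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "Book name Story";
// The lookbehind accepts either prefix; the lookahead requires the match to end the line.
Pattern regex = Pattern.compile("(?<=Book name |Book )\\S+(?=$)");
Matcher regexMatcher = regex.matcher(s);
if (regexMatcher.find()) {
    String resultString = regexMatcher.group();
    System.out.println(resultString); // => Story
}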
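The second pattern can be checked the same way. The following is only a minimal sketch: the class name SecondPatternDemo and the helper firstMatch are made up for illustration, and the second input "Book Story" is an assumed test case that does not appear in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SecondPatternDemo {
    // Second alternative from above: the negative lookahead (?!name) stops the
    // shorter prefix "Book " from matching when "name" follows it.
    static final Pattern PATTERN =
        Pattern.compile("(?<=Book name )\\S+|(?<=Book )(?!name)\\S+");

    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstMatch(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("Book name Story")); // Story
        System.out.println(firstMatch("Book Story"));      // Story
    }
}
```

Unlike the first pattern, this one does not require the match to reach the end of the line, so it may be the easier one to combine with a different right-hand lookahead.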
Explanation:
(?<=Book name |Book )
Lookbehind asserts that the match starts just after the string "Book name " or "Book ".
\S+
Matches one or more non-space characters.
(?=$)
Lookahead asserts that what follows must be the end of the line.
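To see why the pattern from the question returns the longer text, it can help to print where each match starts. The regex engine tries positions from left to right and stops at the first one where the whole pattern succeeds, so the order of the alternatives inside the lookbehind does not matter. This is a rough sketch only: LeftmostMatchDemo and describe are hypothetical names chosen for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeftmostMatchDemo {
    // Hypothetical helper: returns "start:match" for the first match, or null.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + ":" + m.group() : null;
    }

    public static void main(String[] args) {
        String s = "Book name Story";
        // Original pattern: the lookbehind already succeeds right after "Book"
        // (index 4), so .* takes everything from there, leading space included.
        System.out.println(describe("(?<=Book name|Book).*", s));
        // Pattern from the answer: the attempt at index 5 fails because "name"
        // is not followed by the end of the line, so the match starts at index 10.
        System.out.println(describe("(?<=Book name |Book )\\S+(?=$)", s));
    }
}
```

The first line printed starts at index 4 and the second at index 10, which is the position just after "Book name ".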