I'm using a Java regex pattern in an application that only allows access to the whole match value (that is, I cannot use capturing groups).
I am trying to extract values from my sample text:
C02 SURVEY : 2010 F10446P BONAPARTE 2D
In the above example, I need to check for the keyword SURVEY and extract the value after the colon (:). The output I want is:
2010 F10446P BONAPARTE 2D
I used the pattern
(?<=(?i)survey\s{2}[:])(?:(?![\n]).)*
In this pattern, I have hardcoded the number of whitespace characters to 2 (\s{2}), but in practice that number varies.
I need to use a quantifier inside the lookbehind.
If there is any other option, please let me know.
You may leverage a feature in a Java regex engine that is called "constrained width lookbehind":
Java accepts quantifiers within lookbehind, as long as the length of the matching strings falls within a pre-determined range. For instance, (?<=cats?) is valid because it can only match strings of three or four characters. Likewise, (?<=A{1,10}) is valid.
That means you may replace the {2} limiting quantifier with one that has both a minimum and a maximum value, e.g. {0,100}, to allow zero to a hundred whitespace characters. Adjust the bounds as you see fit.
Besides, you needn't use a tempered greedy token, (?:(?![\n]).)*, because the dot in a Java regex does not match line terminators by default (that is, unless DOTALL mode is enabled). Just replace it with .* to match zero or more characters other than line terminators. So, your pattern might look as simple as
(?i)(?<=survey\s{0,100}:).*
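To check the suggested pattern, here is a minimal sketch that uses only the whole match value (Matcher.group() with no capturing groups). The class name SurveyExtractor and the helper extract are hypothetical names made up for this illustration, and the upper bound of 100 is just the example value from above. Note that the match starts right after the colon, so any whitespace following the colon is part of the matched value.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class SurveyExtractor {
    // The bounded quantifier {0,100} keeps the lookbehind width finite.
    // The inline flag (?i) makes the keyword match case-insensitively.
    private static final Pattern SURVEY_VALUE =
            Pattern.compile("(?i)(?<=survey\\s{0,100}:).*");

    // Returns the whole match value, or an empty Optional
    // when the keyword followed by a colon is not found.
    static Optional<String> extract(String text) {
        Matcher m = SURVEY_VALUE.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] samples = {
            "C02 SURVEY : 2010 F10446P BONAPARTE 2D", // one space before the colon
            "C02 survey     :2010 F10446P",           // five spaces, lower case
            "C02 SURVEY:ABC\nsecond line",            // the dot stops at the newline
            "no keyword here"                          // no match
        };
        for (String s : samples) {
            // Brackets make leading whitespace in the match visible.
            System.out.println("[" + extract(s).orElse("<no match>") + "]");
        }
    }
}
```

If the leading whitespace is unwanted and the application cannot trim the result, one option might be to extend the lookbehind and require a non-whitespace first character, for example (?i)(?<=survey\s{0,100}:\s{0,100})\S.*, at the cost of not matching when nothing but whitespace follows the colon.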