I'm looking to construct a regex which will help me identify the first occurrence of a match.
My current regex is "(.*)[Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*(.*)"
What I am trying to do is find out whether the input string contains the word "case" (case-insensitive), followed by any number of special characters, followed by a number.
I want to retrieve 3 parts of the text.
Say my input string is "RE: FW: case:-1234: there is some description"
Using this regex, I am able to retrieve "RE: FW: ", "1234", and "there is some description".
This is fine, but if my input string is
"RE: FW: case:-1234: This is in reference to case 789 reopening"
then my regex returns "RE: FW: case:-1234: This is in reference to", "789", and "reopening".
What I would like to get is "RE: FW: ", "1234", and "This is in reference to case 789 reopening".
I am a newbie with regex, so any help is much appreciated.
Note: I am working on a Java-based tool, so a Java-compatible regex would be nice.
Does your regex have to match the entire string (i.e. does it use matches)? If not (or if you can choose to use find instead), simply remove the (.*) groups, because the greedy leading one is what pushes your match back to the last "case":
[Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*
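With find, the text before and after the match is no longer captured by groups. Here is a minimal sketch of one way to recover all three parts, assuming you can call Matcher.start() and Matcher.end() from your tool; the class and method names are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class FindFirstCase {
    // The regex above, without the surrounding (.*) groups.
    static final Pattern CASE_NUMBER = Pattern.compile(
            "[Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*");

    /** Returns {prefix, number, rest} for the first "case <number>", or null if there is none. */
    static String[] split(String input) {
        Matcher m = CASE_NUMBER.matcher(input);
        // find() scans left to right, so the first hit is the first occurrence.
        if (!m.find()) {
            return null;
        }
        return new String[] {
            input.substring(0, m.start()), // everything before "case"
            m.group(1),                    // the digits
            input.substring(m.end())       // everything after the trailing separators
        };
    }

    public static void main(String[] args) {
        String input = "RE: FW: case:-1234: This is in reference to case 789 reopening";
        System.out.println(String.join(" | ", split(input)));
    }
}
```

For the second input from the question, the three parts come out as "RE: FW: ", "1234", and "This is in reference to case 789 reopening".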
Otherwise, make the leading repetition non-greedy:
(.*?)[Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*(.*)
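To see the difference, here is a small sketch that runs the greedy and the non-greedy pattern against the second input from the question using matches(). The class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the two regexes come from the thread.
class FirstCaseMatch {
    // Greedy leading group: grabs as much as it can, so the match lands on the LAST "case".
    static final Pattern GREEDY = Pattern.compile(
            "(.*)[Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*(.*)");
    // Non-greedy leading group: grabs as little as it can, so the match lands on the FIRST "case".
    static final Pattern LAZY = Pattern.compile(
            "(.*?)[Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*(.*)");

    /** Returns the three captured groups, or null when the whole input does not match. */
    static String[] split(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String input = "RE: FW: case:-1234: This is in reference to case 789 reopening";
        System.out.println(String.join(" | ", split(GREEDY, input)));
        System.out.println(String.join(" | ", split(LAZY, input)));
    }
}
```

The greedy pattern yields "789" as the number, while the non-greedy one yields "RE: FW: ", "1234", and "This is in reference to case 789 reopening".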
By the way, you can simplify this, using case-insensitive matching. If you cannot activate it in your tool, you can do it inline in the regex:
(?i)(.*?)case[^a-z\\d]*(\\d+)[^a-z\\d]*(.*)
Note that I also simplified the number: + means one or more occurrences.
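If your tool does let you pass flags, the inline (?i) and Pattern.CASE_INSENSITIVE should behave the same way. A minimal sketch comparing the two, with made-up class and method names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CaseInsensitiveDemo {
    // Case-insensitivity switched on inside the regex itself.
    static final Pattern INLINE = Pattern.compile(
            "(?i)(.*?)case[^a-z\\d]*(\\d+)[^a-z\\d]*(.*)");
    // Same expression, with the flag passed to compile() instead.
    static final Pattern FLAGGED = Pattern.compile(
            "(.*?)case[^a-z\\d]*(\\d+)[^a-z\\d]*(.*)", Pattern.CASE_INSENSITIVE);

    /** Returns the first case number, or null when the input does not match. */
    static String caseNumber(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "RE: FW: CASE:-1234: there is some description";
        System.out.println(caseNumber(INLINE, input));
        System.out.println(caseNumber(FLAGGED, input));
    }
}
```

Both patterns pick out "1234" here, regardless of how "case" is capitalised in the input.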