
Regex to match only till first occurrence of class match


I'm looking to construct a regex which will help me identify the first occurrence of a match.

My current regex is "(.*)[Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*(.*)"

What I am trying to do is check whether the input string contains the word "case" (case-insensitive), followed by any number of special characters, followed by a number, and then retrieve three parts of the text. Say my input string is "RE: FW: case:-1234: there is some description". Using this regex, I am able to retrieve "RE: FW: ", "1234", and "there is some description".

This is fine, but if my input string is "RE: FW: case:-1234: This is in reference to case 789 reopening", then my regex returns "RE: FW: case:-1234: This is in reference to", "789", "reopening".

What I would like to get is "RE: FW: ", "1234", "This is in reference to case 789 reopening".

I am a newbie with regex, so any help is much appreciated.

Note: I am working on a Java-based tool, so a Java-compatible regex would be nice.


Solution

  • Does your regex have to match the entire string (i.e. is it used with matches)? If not (or if you can choose to use find instead), simply remove both (.*) groups. The greedy leading (.*) is what pushes your match back to the last occurrence, whereas find stops at the first one:

    [Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*
    

    Otherwise, make the leading repetition non-greedy, so that it consumes as little as possible before the first "case":

    (.*?)[Cc][Aa][Ss][Ee][^a-zA-Z\\d]*(\\d\\d*)[^a-zA-Z\\d]*(.*)
    

    By the way, you can simplify this by using case-insensitive matching. If you cannot activate it in your tool, you can enable it inline in the regex with (?i):

    (?i)(.*?)case[^a-z\\d]*(\\d+)[^a-z\\d]*(.*)
    

    Note that I also simplified the number: + means one or more occurrences, so \\d+ is equivalent to \\d\\d*.
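
To make the two options above concrete, here is a minimal Java sketch. The class name CaseSplitter and the method names splitWithMatches and splitWithFind are made up for illustration and are not part of the original question or answer. It assumes the flag Pattern.CASE_INSENSITIVE can be used in place of the inline (?i), and that the input contains no line breaks (by default . does not match them).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class CaseSplitter {

    // Whole-string pattern: the lazy (.*?) stops at the first "case" + number.
    private static final Pattern FULL = Pattern.compile(
            "(.*?)case[^a-z\\d]*(\\d+)[^a-z\\d]*(.*)",
            Pattern.CASE_INSENSITIVE);

    // Same core without the surrounding groups, intended for find().
    private static final Pattern CORE = Pattern.compile(
            "case[^a-z\\d]*(\\d+)[^a-z\\d]*",
            Pattern.CASE_INSENSITIVE);

    // Returns {prefix, number, rest} using matches(), or null if there is no match.
    static String[] splitWithMatches(String input) {
        Matcher m = FULL.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    // Returns {prefix, number, rest} using find(); the prefix and the rest
    // are cut out of the input with the match's start and end offsets.
    static String[] splitWithFind(String input) {
        Matcher m = CORE.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            input.substring(0, m.start()),
            m.group(1),
            input.substring(m.end())
        };
    }

    public static void main(String[] args) {
        String input = "RE: FW: case:-1234: This is in reference to case 789 reopening";
        for (String part : splitWithMatches(input)) {
            System.out.println("[" + part + "]");
        }
        for (String part : splitWithFind(input)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

For the second input from the question, both methods should give "RE: FW: ", "1234" and "This is in reference to case 789 reopening". The find variant has only one capturing group, so the number is group 1 there rather than group 2.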