Regular Expression capturing group with optional delimiter

This seemed like a simple problem: I need to extract a capturing group and optionally limit the group with a delimiting string.

In the example below, I provide a delimiting string of 'cd' and expect the capture group to be 'ab' in all three cases: 'ab', 'abcd', and 'abcdefg'.

Here is the code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args) {
    String expected = "ab"; // Could be more or less than two characters
    String[] tests = {"ab", "abcd", "abcdefg"};
    Pattern pattern = Pattern.compile("(.*)cd?.*");

    for (String test : tests) {
        Matcher match = pattern.matcher(test);
        if (match.matches()) {
            // Compare the first capture group against the expected prefix
            if (expected.equals(match.group(1)))
                System.out.println("Capture Group for test: " + test + " - " + match.group(1));
            else System.err.println("Expected " + expected + " but captured " + match.group(1));
        } else System.err.println("No match for " + test);
    }
}

The output is:

    No match for ab
    Capture Group for test: abcd - ab
    Capture Group for test: abcdefg - ab
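
The "No match for ab" line follows from how the quantifier binds: in `cd?` the `?` applies only to the `d`, so the `c` is still required. The sketch below illustrates this; the class name `QuantifierScope` and its helper methods are hypothetical, introduced only for this example. It also shows that grouping the delimiter as `(?:cd)?` is not enough on its own, because the greedy `(.*)` then swallows the delimiter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierScope {
    // Returns true when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Returns capture group 1 on a full match, or null when there is no match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // In "cd?" only the 'd' is optional; the 'c' is mandatory.
        System.out.println(fullMatch("(.*)cd?.*", "ab"));          // false
        System.out.println(fullMatch("(.*)cd?.*", "abc"));         // true
        // Making the whole delimiter optional lets the greedy group take everything.
        System.out.println(firstGroup("(.*)(?:cd)?.*", "abcd"));   // abcd
    }
}
```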

I thought that a lookahead might work, but I don't think there is one that is optional (i.e. matches zero or more instances).


  • Try this:

    Pattern pattern = Pattern.compile("(.*?)(?:cd.*|$)");

    The .*? is non-greedy, so the group captures as little as possible, and the rest of the regex matches either cd followed by anything, or the end of the string.
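
A minimal sketch for checking the suggested pattern against the three inputs from the question. The class name `OptionalDelimiterDemo` and its method names are hypothetical. The second pattern is a lookahead variant that is not part of the answer above; it is included only because the question asks about lookaheads, and it relies on alternation with `$` inside the lookahead to make the delimiter effectively optional.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalDelimiterDemo {
    // Pattern from the answer: reluctant capture, then "cd" plus the rest, or end of input.
    static final Pattern ALTERNATION = Pattern.compile("(.*?)(?:cd.*|$)");
    // Assumed alternative: stop the capture just before "cd" or at end of input.
    static final Pattern LOOKAHEAD = Pattern.compile("^(.*?)(?=cd|$)");

    // Full-match variant; returns null if the input does not match.
    static String captureWithAlternation(String input) {
        Matcher m = ALTERNATION.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Lookahead variant; find() is used because the lookahead consumes no input.
    static String captureWithLookahead(String input) {
        Matcher m = LOOKAHEAD.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        for (String test : new String[] {"ab", "abcd", "abcdefg"}) {
            System.out.println(test + " -> " + captureWithAlternation(test)
                    + " / " + captureWithLookahead(test));
        }
    }
}
```

For all three inputs, both variants capture ab. Note that `.` does not match line terminators by default, so multi-line input would need `Pattern.DOTALL` or a different approach.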