
Music Chord part splitting Regex


This is a follow-up to my earlier question, "Regex for matching a music Chord".

Now that I have a regex to check whether a String representation of a chord is valid (previous question), how can I get the three different parts of the chord (the root note, accidentals and chord type) into separate variables?

I could do simple string manipulation, but I guess that it would be easier to build on the previous code and use regex for that, or am I wrong?

Here is the updated code from the aforementioned question:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void regex(String chord) {
    String notes = "^[CDEFGAB]";
    String accidentals = "(#|##|b|bb)?";
    String chords = "(maj7|maj|min7|min|sus2)";
    String regex = notes + accidentals + chords;
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(chord);
    System.out.println("regex is " + regex);
    if (matcher.find()) {
        // start() and end() give the bounds of the whole match
        int i = matcher.start();
        int j = matcher.end();
        System.out.println("i:" + i + " j:" + j);
    }
    else {
        System.out.println("no match!");
    }
}

Thanks.


Solution

  • Enclosing something with parentheses (except in cases with special meaning) creates a capturing group, or subpattern.

    You already have accidentals and chords grouped as subpatterns like that, but you need to add parentheses to notes to capture that as a subpattern too.

    String notes = "^([CDEFGAB])";
    String accidentals = "(#|##|b|bb)?";
    String chords = "(maj7|maj|min7|min|sus2)";
    

    By convention, the string that is matched by the entire pattern is group 0, then every subpattern is captured as group 1, group 2, and so on.

    I'm not a Java guy, but after reading the docs it looks like you would access your subpattern matches using .group():

    String note = matcher.group(1);
    String acci = matcher.group(2);
    String chor = matcher.group(3);
    

    Edit:

    Originally, I suggested String accidentals = "((?:#|##|b|bb)?)";, because I was worried that the second subpattern being optional would cause a group numbering problem if no match existed for it. It does not: the group numbers stay the same whether or not the optional group matches. In Java, however, matcher.group(2) returns null (not an empty string) when the optional group did not take part in the match. So ... = "(#|##|b|bb)?"; works as long as you check for null, while the ((?:#|##|b|bb)?) form always captures and gives you an empty string when there is no accidental.
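
Putting the pieces together, here is a minimal sketch of how the three parts could be pulled out with capturing groups. The class name ChordParts and the method split are made up for illustration; the pattern is the one from the question with the root note wrapped in parentheses. Since Java's Matcher.group(int) returns null for a group that did not take part in the match, the sketch turns a missing accidental into an empty string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ChordParts {
    // Group 1: root note, group 2: optional accidentals, group 3: chord type.
    private static final Pattern CHORD = Pattern.compile(
            "^([CDEFGAB])(#|##|b|bb)?(maj7|maj|min7|min|sus2)");

    /** Returns {root, accidentals, type}, or null if the chord does not match. */
    public static String[] split(String chord) {
        Matcher m = CHORD.matcher(chord);
        if (!m.find()) {
            return null;
        }
        // group(2) is null when the optional accidentals group did not match.
        String acc = m.group(2);
        return new String[] { m.group(1), acc == null ? "" : acc, m.group(3) };
    }

    public static void main(String[] args) {
        for (String chord : new String[] { "C#maj7", "Bbmin", "Gsus2", "H7" }) {
            String[] parts = split(chord);
            System.out.println(chord + " -> "
                    + (parts == null ? "no match" : String.join(" | ", parts)));
        }
    }
}
```

Returning null for an invalid chord is just one option; throwing an exception or returning an Optional would work equally well, depending on how the caller wants to handle bad input.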