Scala regex Named Capturing Groups


In scala.util.matching.Regex, the trait MatchData has support for groupNames. I thought this was related to regex named capturing groups.

But as I understand it (ref), Java does not support named groups until version 7, so Scala version 2.8.0 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6) gives me this exception:

scala> val pattern = """(?<login>\w+) (?<id>\d+)""".r
java.util.regex.PatternSyntaxException: Look-behind group does not have an obvious maximum length near index 11
(?<login>\w+) (?<id>\d+)
           ^
        at java.util.regex.Pattern.error(Pattern.java:1713)
        at java.util.regex.Pattern.group0(Pattern.java:2488)
        at java.util.regex.Pattern.sequence(Pattern.java:1806)
        at java.util.regex.Pattern.expr(Pattern.java:1752)
        at java.util.regex.Pattern.compile(Pattern.java:1460)

So the question is: are named capturing groups supported in Scala? If so, are there any examples out there?


Solution

  • I'm afraid that Scala's named groups aren't defined the same way. They are nothing but post-processing aliases for the unnamed (i.e. just numbered) groups in the original pattern.

    Here's an example:

    import scala.util.matching.Regex
    
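    object Main {
       def main(args: Array[String]): Unit = {
          // The names are positional aliases: "firstName" is group 1, "lastName" is group 2.
          val pattern = new Regex("""(\w*) (\w*)""", "firstName", "lastName")
          // .get throws if nothing matches; acceptable here because the input is fixed.
          val result = pattern.findFirstMatchIn("James Bond").get
          println(result.group("lastName") + ", " + result.group("firstName"))
       }
    }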
    

    This prints (as seen on ideone.com):

    Bond, James
    

    What happens here is that in the constructor for the Regex, we provide the aliases for group 1, 2, etc. Then we can refer to these groups by those names. These names are not intrinsic to the pattern itself.
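To make the aliasing concrete, here is a minimal sketch that compares the two approaches. The object name NamedGroupsSketch and its helper methods are made up for illustration. The second half assumes a Java 7 or later runtime, where java.util.regex accepts the inline (?<name>...) syntax from the question. It calls the underlying Java matcher directly, so it does not rely on how any particular Scala version resolves inline names.

```scala
import scala.util.matching.Regex

// Hypothetical helper object, not part of the original answer.
object NamedGroupsSketch {
  // Scala-side aliases: "firstName" refers to group 1, "lastName" to group 2.
  val aliased = new Regex("""(\w+) (\w+)""", "firstName", "lastName")

  // Inline names; assumes Java 7+, otherwise compiling this pattern throws.
  val inlineNamed: Regex = """(?<login>\w+) (?<id>\d+)""".r

  // Look up a group through its Scala-side alias.
  def lastNameOf(s: String): Option[String] =
    aliased.findFirstMatchIn(s).map(_.group("lastName"))

  // Look up inline-named groups through the underlying java.util.regex.Matcher.
  def loginAndId(s: String): Option[(String, String)] = {
    val m = inlineNamed.pattern.matcher(s)
    if (m.matches()) Some((m.group("login"), m.group("id"))) else None
  }

  def main(args: Array[String]): Unit = {
    val m = aliased.findFirstMatchIn("James Bond").get
    // The alias and the plain group number point at the same capture.
    println(m.group("firstName") == m.group(1))
    println(lastNameOf("James Bond"))
    println(loginAndId("jbond 007"))
  }
}
```

Both helpers return an Option, so a non-matching input yields None instead of the exception that calling .get on a failed match would throw.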