regex, scala, case-insensitive, capturing-group

Why do I get a MatchError when trying to grab this capturing group?


I've got a regex question. Here's a case-insensitive regex to grab the id out of a URL:

scala> val idRegex = """(?i)images\/(.*)\.jpg""".r
idRegex: scala.util.matching.Regex = (?i)images\/(.*)\.jpg

It matches my subject:

scala> val slidephotoId = idRegex.findFirstIn("/xml/deliverables/images/23044.jpg")
slidephotoId: Option[String] = Some(images/23044.jpg)

But when I use it as an extractor I get a match error:

scala> val idRegex(id) = "/xml/deliverables/images/23044.jpg"
scala.MatchError: /xml/deliverables/images/23044.jpg (of class java.lang.String)
  ... 43 elided

What am I doing wrong there?


Solution

  • When a Scala Regex is used as an extractor in a pattern match, it is anchored by default, meaning it must match the entire input. If you make the regex unanchored, the extraction works:

    scala> val idRegex = """(?i)images\/(.*)\.jpg""".r.unanchored
    idRegex: scala.util.matching.UnanchoredRegex = (?i)images\/(.*)\.jpg
    
    scala> val idRegex(id) = "/xml/deliverables/images/23044.jpg"
    id: String = 23044
    

    Another option is to change the regex so that it matches the entire input, e.g.:

    scala> val idRegex = """(?i).+images\/(.*)\.jpg""".r
    idRegex: scala.util.matching.Regex = (?i).+images\/(.*)\.jpg
    
    scala> val idRegex(id) = "/xml/deliverables/images/23044.jpg"
    id: String = 23044
    

    findFirstIn, by contrast, returns the expected result whether or not the regex is anchored: it scans the input for the first matching substring and does not require the entire input to match.
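
If the input might not match at all, a `val idRegex(id) = ...` binding will still throw a MatchError even with an unanchored regex. A minimal sketch of two ways to get an Option back instead, using the pattern from the question (the names `ExtractId`, `extractId`, and `extractIdViaGroup` are made up for illustration):

```scala
import scala.util.matching.Regex

// Hypothetical helper object; the names here are illustrative only.
object ExtractId {
  // Same pattern as in the question; (?i) makes it case-insensitive.
  val idRegex: Regex = """(?i)images\/(.*)\.jpg""".r

  // Unanchored variant: as an extractor it may match any substring of the input.
  val unanchoredId = idRegex.unanchored

  // Pattern match with a fallback case, so a non-matching input yields None
  // instead of a MatchError.
  def extractId(url: String): Option[String] = url match {
    case unanchoredId(id) => Some(id)
    case _                => None
  }

  // Alternative without an extractor: findFirstMatchIn scans the input
  // (like findFirstIn) but returns a Match, whose capture groups are accessible.
  def extractIdViaGroup(url: String): Option[String] =
    idRegex.findFirstMatchIn(url).map(_.group(1))

  def main(args: Array[String]): Unit = {
    println(extractId("/xml/deliverables/images/23044.jpg"))         // prints Some(23044)
    println(extractIdViaGroup("/xml/deliverables/images/23044.jpg")) // prints Some(23044)
    println(extractId("/xml/deliverables/readme.txt"))               // prints None
  }
}
```

Both functions keep the original regex unchanged; which one reads better is mostly a matter of taste, though the extractor form scales more naturally to patterns with several capturing groups.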