
Pattern matching using string interpolation


In the following example, using Scala 2.13.3, the 1st pattern matches but the 2nd does not.

The 3rd pattern matches again, while the 4th does not (note that the separator in the 4th match expression is enclosed in backticks and thus references the value defined before).

  trait A
  case object A extends A {
    def unapply(a: String): Option[A] = if (a == "my_a") Some(A) else None
  }
  trait B
  case object B extends B {
    def unapply(b: String): Option[B] = if (b == "myB") Some(B) else None
  }

  val match1 = "myB_my_a" match {
    case s"${B(b)}_${A(a)}" => Some((a,b))
    case _ => None
  } // Some((A,B))

  val match2 = "my_a_myB" match {
    case s"${A(a)}_${B(b)}" => Some((a,b))
    case _ => None
  } // None

  val match3 = "my_a__myB" match {
    case s"${A(a)}__${B(b)}" => Some((a,b))
    case _ => None
  } // Some((A,B))

  val separator = "__"
  val match4 = s"my_a${separator}myB" match {
    case s"${A(a)}${`separator`}${B(b)}" => Some((a,b))
    case _ => None
  } // None

Why do only the 1st and the 3rd pattern match?

Is there a good alternative to the 2nd pattern that a) uses the unapply methods of A and B and b) works without knowing which strings these methods accept?

Edit 1: Added case object B and another matching example.

Edit 2: Another example to illustrate jwvh's answer:

  val (a, b) = ("my_a", "myB")
  val match5 = s"${a}_${b}" match {
    case s"${`a`}_${`b`}" => Some((a, b)) // does not match
    case s"${x}_${y}" => Some((x, y)) // matches: Some(("my", "a_myB"))
  }

Edit 3: To illustrate how, unlike case class construction and extraction with apply and unapply, the construction and extraction of strings using similar string interpolation are not (and cannot be) inverse functions:

  case class AB(a: String, b: String)
  val id = (AB.apply _).tupled andThen AB.unapply andThen (_.get)
  val compare = id(("my_a", "myB")) == ("my_a", "myB") // true

  val construct: (String, String) => String = (a,b) => s"${a}_${b}"
  val extract: String => (String, String) = { case s"${a}_${b}" => (a,b) }
  val id2 = construct.tupled andThen extract
  val compare2 = id2(("my_a","myB")) == ("my_a","myB") // false

Solution

  • As your own test (mentioned in the comments) demonstrates, the interpolator recognizes that the match pattern "${A(a)}_${B(b)}" is made up of 2 parts separated by an underscore _. The target string is therefore split at the first underscore it contains, and no other split is tried afterwards.

    The 1st part, "my", is sent to A.unapply(), where it fails. The 2nd part, "a_myB", is not even attempted.

    Something similar happens in match4. The pattern "${A(a)}${`separator`}${B(b)}" has 3 dollar signs and thus 3 parts. But with no literal characters between them to anchor the pattern, the target string is split into these 3 parts:

    1. ""
    2. ""
    3. "my_a__myB"

    Again, the 1st part fails the unapply() and the other parts are never attempted.
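
The two splits described above can be observed directly by binding the holes to plain variables instead of extractors. This is only a minimal sketch, assuming Scala 2.13; the names SplitDemo, twoParts and threeParts are made up for illustration.

```scala
object SplitDemo {
  // One literal "_" between two holes: the first hole is matched as short
  // as possible, so the split happens at the first underscore.
  val twoParts: (String, String) = "my_a_myB" match {
    case s"${x}_${y}" => (x, y)
  }

  // Three adjacent holes with no literal text between them: the first two
  // holes come back empty and the last one takes the whole input.
  val threeParts: (String, String, String) = "my_a__myB" match {
    case s"${x}${y}${z}" => (x, y, z)
  }

  def main(args: Array[String]): Unit = {
    println(twoParts)   // (my,a_myB)
    println(threeParts) // (,,my_a__myB)
  }
}
```

Only after this split do the nested patterns (A(a), B(b), or the backticked separator) get to see their piece, which is why a failure on the 1st piece ends the whole case.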


    While your Edit 3 code is technically correct, I don't find it terribly convincing. You've simply demonstrated that (String,String)=>AB(String,String)=>(String,String) is (or can be) a lossless data transition. The same cannot be said of (String,String)=>String which introduces some ambiguity, i.e. the loss of information sufficient to guarantee restoration of the original data. That loss is inherent in the transformation itself, not the tools (interpolation) used to achieve it.
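
The ambiguity can be shown without any pattern matching at all. In the sketch below, construct is copied from Edit 3, while LossyJoinDemo, first and second are illustrative names: two different input pairs produce the same string, so no extractor could recover both.

```scala
object LossyJoinDemo {
  // Same construction as in Edit 3: join two strings with an underscore.
  val construct: (String, String) => String = (a, b) => s"${a}_${b}"

  // Two different input pairs ...
  val first: String  = construct("my_a", "myB")
  val second: String = construct("my", "a_myB")

  def main(args: Array[String]): Unit = {
    // ... produce the same string, so the original pair cannot be restored.
    println(first)           // my_a_myB
    println(first == second) // true
  }
}
```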

    The fact that case classes and String interpolation both use extractor methods under the hood (unapply() for case classes, unapplySeq() for the s interpolator) strikes me as inconsequential.
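
As for the alternative asked about in the question: one possible approach is to stop relying on the interpolator's single split and instead try every occurrence of the separator until both extractors accept their piece. The following is a rough sketch, not a definitive solution. It assumes the separator is known and non-empty; A and B are copied from the question, and SplitSearch, splits and parse are hypothetical names.

```scala
object SplitSearch {
  trait A
  case object A extends A {
    def unapply(a: String): Option[A] = if (a == "my_a") Some(A) else None
  }
  trait B
  case object B extends B {
    def unapply(b: String): Option[B] = if (b == "myB") Some(B) else None
  }

  // Hypothetical helper: every way to cut `input` at one occurrence of `sep`,
  // from left to right. Assumes `sep` is non-empty.
  def splits(input: String, sep: String): Iterator[(String, String)] = {
    require(sep.nonEmpty, "separator must not be empty")
    Iterator
      .iterate(input.indexOf(sep))(i => input.indexOf(sep, i + 1))
      .takeWhile(_ >= 0)
      .map(i => (input.take(i), input.drop(i + sep.length)))
  }

  // Hypothetical helper: the first cut that both extractors accept, if any.
  // Nothing here depends on which strings A.unapply and B.unapply accept.
  def parse(input: String, sep: String): Option[(A, B)] =
    splits(input, sep).collectFirst { case (A(a), B(b)) => (a, b) }

  def main(args: Array[String]): Unit = {
    println(parse("my_a_myB", "_"))   // Some((A,B))
    println(parse("my_a__myB", "__")) // Some((A,B))
    println(parse("myB_my_a", "_"))   // None
  }
}
```

This trades the conciseness of the s"..." pattern for a search over all candidate splits. It still cannot undo the ambiguity described above: if several cuts were accepted by both extractors, it would simply return the leftmost one.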