scala pattern-matching case-class infix-notation parser-combinators

How does this case class definition allow infix pattern matching?

I recently wrote a parser using Scala's parser combinator library. I got curious about the implementation and went digging.

While reading through the code, I saw that ~ sequencing used a case class to hold the left and right values.

Attached is the following comment:

/** A wrapper over sequence of matches.
   *
   *  Given `p1: Parser[A]` and `p2: Parser[B]`, a parser composed with
   *  `p1 ~ p2` will have type `Parser[~[A, B]]`. The successful result
   *  of the parser can be extracted from this case class.
   *
   *  It also enables pattern matching, so something like this is possible:
   *
   *  {{{
   *  def concat(p1: Parser[String], p2: Parser[String]): Parser[String] =
   *    p1 ~ p2 ^^ { case a ~ b => a + b }
   *  }}}
   */
  case class ~[+a, +b](_1: a, _2: b) {
    override def toString = "("+ _1 +"~"+ _2 +")"
  }

Given that the code in the comment works, and that the results of parsers composed with a ~ b can be extracted via { case a ~ b => ... }, how exactly does this un-application work? I am aware of the unapply method in Scala, but none is provided here. Do case classes provide one by default (I think yes)? If so, how does this particular case class become case a ~ b and not case ~(a,b)? Is this a pattern that can be exploited by Scala programmers?

This differs from objects with unapply in this question because no unapply method exists–or does it? Do case classes auto-magically receive unapply methods?

Solution

Do case classes provide [unapply] by default (I think yes)?

Your suspicions are correct. unapply() is one of the many things automatically supplied in a case class. You can verify this for yourself by doing the following:

1. Write a simple class definition in its own file.
2. Compile the file only through the "typer" phase and save the results. (Invoke scalac -Xshow-phases to see a description of all the compiler phases.)
3. Edit the file. Add the word case before the class definition.
4. Repeat step 2.
5. Compare the two saved results.

From a Bash shell it might look like this.

%%> cat srcfile.scala
class XYZ(arg :Int)
%%> scalac -Xprint:4 srcfile.scala > plainClass.phase4
%%> vi srcfile.scala  # add “case” 
%%> scalac -Xprint:4 srcfile.scala > caseClass.phase4
%%> diff plainClass.phase4 caseClass.phase4

There will be a lot of compiler noise to wade through, but you'll see that by simply adding case to your class the compiler generates a ton of extra code.

Some of the things to note:

Case class instances:

- have Product and Serializable mixed in to the type
- provide public access to the constructor parameters
- have methods copy(), productArity, productElement(), and canEqual()
- override (provide new code for) methods productPrefix, productIterator, hashCode(), toString(), and equals()

The companion object (created by the compiler):

- has methods apply() and unapply()
- overrides toString()
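A minimal sketch of a few of these generated members in use. Point is a hypothetical case class invented for illustration; it does not appear in the parser combinator library.

```scala
// Hypothetical case class, used only to exercise the generated members.
case class Point(x: Int, y: Int)

object CaseClassDemo {
  def main(args: Array[String]): Unit = {
    val p = Point(1, 2)           // companion apply(): no `new` required
    println(p.x)                  // constructor parameters are public
    println(p.copy(y = 5))        // prints Point(1,5)
    println(p.productArity)       // prints 2
    println(p.productElement(0))  // first constructor argument
    println(p == Point(1, 2))     // structural equals(): prints true

    // The companion's generated unapply() is what makes this pattern work.
    val Point(a, b) = p
    println(a + b)                // prints 3
  }
}
```

None of these methods are written in the source; they all come from the single word case.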

If so, how does this particular case class become case a ~ b and not case ~(a,b)?

This turns out to be a nice (if rather obscure) convenience offered by the language.

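Any constructor or extractor pattern that takes two sub-patterns can be written in infix form: the pattern a op b is shorthand for op(a, b). The only requirement is that unapply() extracts two values, typically by returning an Option of a two-element tuple, so case a ~ b is the same pattern as case ~(a, b). Again, this is pretty easy to verify.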

class XX(val c:Char, val n:Int)
object XX {
  def unapply(arg: XX): Option[(Char, Int)] = Some((arg.c,arg.n))
}

val a XX b = new XX('g', 9)
//a: Char = g
//b: Int = 9
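
The same equivalence can be checked with a case class, where unapply() is generated rather than written by hand. The sketch below defines its own ~ class mirroring the shape quoted in the question; it is a stand-in written for illustration, not the library's class, and the names InfixPatternDemo, sumPrefix, sumInfix, and sumNested are made up.

```scala
// Stand-in for the library's `~`: a plain two-field case class.
case class ~[+A, +B](_1: A, _2: B)

object InfixPatternDemo {
  // Prefix form of the pattern.
  def sumPrefix(p: ~[Int, Int]): Int = p match {
    case ~(a, b) => a + b
  }

  // Infix form: shorthand for the same ~(a, b) pattern.
  def sumInfix(p: Int ~ Int): Int = p match {
    case a ~ b => a + b
  }

  // Infix patterns with names not ending in ':' associate to the left,
  // so `a ~ b ~ c` is read as ~(~(a, b), c).
  def sumNested(p: Int ~ Int ~ Int): Int = p match {
    case a ~ b ~ c => a + b + c
  }

  def main(args: Array[String]): Unit = {
    // `new` is used because a bare `~(1, 2)` expression would be read
    // as the prefix operator `~` applied to a tuple.
    val pair = new ~(1, 2)
    println(sumPrefix(pair))
    println(sumInfix(pair))
    println(sumNested(new ~(pair, 3)))
  }
}
```

The left-associative nesting is why chained parsers such as p1 ~ p2 ~ p3 can be unpacked with a flat-looking case a ~ b ~ c.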