Scala pattern matching not working with Option[Seq[String]]


I am new to Scala (2.13.8) and am trying to use pattern matching to handle a value in different ways. A simplified version of the code is below:

def getOption(o: Option[Any]): Unit = {
  o match {
    case l: Some[List[String]] => handleListData(l)
    case _  => handleData(_)
  }
}

getOption(Some(3))
getOption(Some(Seq("5555")))

The result is that handleListData() is invoked for both inputs. Can someone help me understand what's wrong with my code?


Solution

  • As sarveshseri mentioned in the comments, the problem here is caused by type erasure. When you compile this code, scalac issues a warning:

    [warn] /Users/tmoore/IdeaProjects/scala-scratch/src/main/scala/PatternMatch.scala:6:15: non-variable type argument List[String] in type pattern Some[List[String]] is unchecked since it is eliminated by erasure
    [warn]       case l: Some[List[String]] => handleListData(l)
    [warn]               ^
    

    This is because the values of type parameters are not available at runtime due to erasure, so this case is equivalent to:

          case l: Some[_] => handleListData(l.asInstanceOf[Some[List[String]]])
    

    This may fail at runtime due to an automatically-inserted cast in handleListData, depending on how it actually uses its argument.
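As a rough illustration of that behaviour, here is a minimal sketch. The object name ErasureDemo and the helpers describe and firstLength are made up for this example and are not part of the original code. It should compile with the same "unchecked" warning shown above.

```scala
object ErasureDemo {
  // Same shape as the question's first case. scalac still emits the
  // "unchecked" warning, because only Some[_] can be tested at runtime.
  def describe(o: Option[Any]): String = o match {
    case _: Some[List[String]] => "matched"
    case _                     => "fell through"
  }

  // The compiler-inserted casts only run when the contents are actually used.
  def firstLength(o: Option[Any]): Int = o match {
    case l: Some[List[String]] => l.get.head.length // casts to List, then to String
    case _                     => -1
  }
}
```

Under these assumptions, ErasureDemo.describe(Some(3)) returns "matched" even though the value is not a list, while ErasureDemo.firstLength(Some(3)) throws a ClassCastException as soon as the content is treated as a List.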

    One thing you can do is take advantage of destructuring in the case pattern in order to do a runtime type check on the content of the Option:

          case Some(l: List[_]) => handleListData(l)
    

    This will work with a handleListData with a signature like this:

      def handleListData(l: List[_]): Unit
    

    Note that it unwraps the Option, which is most likely more useful than passing it along.

    However, it does not check that the List contains strings. To do so would require inspecting each item in the list. The alternative is an unsafe cast, made with the assumption that the list contains strings. This opens up the possibility of runtime exceptions later if the list elements are cast to strings, and are in fact some other type.
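If inspecting each item is acceptable, one possible sketch is to add a guard to the pattern so that the cast only happens after every element has been checked. The object name CheckedListMatch is hypothetical, and the handlers here return a String instead of Unit purely so the result is easy to inspect.

```scala
object CheckedListMatch {
  def handleListData(l: List[String]): String = s"strings: ${l.mkString(",")}"
  def handleData(value: Any): String = s"other: $value"

  def getOption(o: Option[Any]): String = o match {
    // The guard checks every element at runtime, so the cast below is safe.
    // Note that an empty list passes the guard vacuously.
    case Some(l: List[_]) if l.forall(_.isInstanceOf[String]) =>
      handleListData(l.asInstanceOf[List[String]])
    case other => handleData(other)
  }
}
```

This costs a full pass over the list on every match. It also assumes the runtime value really is a List. Matching on Seq[_] instead would accept other sequence types as well.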

    This change also reveals a problem with the second case:

        case _  => handleData(_)
    

    This does not do what you probably think it does, and issues its own compiler warning:

    [warn] /Users/tmoore/IdeaProjects/scala-scratch/src/main/scala/PatternMatch.scala:7:28: a pure expression does nothing in statement position
    [warn]       case _  => handleData(_)
    [warn]                            ^
    

    What does this mean? It's telling us that this operation has no effect. It does not invoke the handleData method with o as you might think. This is because the _ character has special meaning in Scala, and that meaning depends on the context where it's used.

    In the pattern match case _, it is a wildcard that means "match anything without binding the match to a variable". In the expression handleData(_) it is essentially shorthand for x => handleData(x). In other words, when this case is reached, it evaluates to a Function value that would invoke handleData when applied, and then discards that value without invoking it. The result is that any value of o that doesn't match the first case will have no effect, and handleData is never called.
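The difference between building the function and calling it can be seen in a small sketch. The object PlaceholderDemo, the calls counter, and the deferred value are hypothetical names used only for this illustration.

```scala
object PlaceholderDemo {
  // Counts how many times handleData actually runs.
  var calls: Int = 0
  def handleData(value: Any): Unit = calls += 1

  // handleData(_) is shorthand for x => handleData(x):
  // this creates a function value and does not call handleData.
  val deferred: Any => Unit = handleData(_)
}
```

After the object is initialised, PlaceholderDemo.calls is still 0. It only becomes 1 once the function is applied, for example with PlaceholderDemo.deferred("x"). In the original match, that function value is simply thrown away.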

    This can be solved by using o in the call:

          case _  => handleData(o)
    

    or by assigning a name to the match:

          case x => handleData(x)
    

    Returning to the original problem: how can you call handleListData only when the argument contains a List[String]? Since the type parameter is erased at runtime, this requires some other kind of runtime type information to differentiate it. A common approach is to define a custom algebraic data type instead of using Option:

    object PatternMatch {
      sealed trait Data
      case class StringListData(l: List[String]) extends Data
      case class OtherData(o: Any) extends Data
    
      def handle(o: Data): Unit = {
        o match {
          case StringListData(l) => handleListData(l)
          case x => handleData(x)
        }
      }
    
      def handleListData(l: List[String]): Unit = println(s"Handling string list data: $l")
      def handleData(value: Any): Unit = println(s"Handling data: $value")
    
      def main(args: Array[String]): Unit = {
        PatternMatch.handle(OtherData(3))
        PatternMatch.handle(StringListData(List("5555", "6666")))
        PatternMatch.handle(OtherData(List(7777, 8888)))
        PatternMatch.handle(OtherData(List("uh oh!")))
    
        /*
         * Output:
         * Handling data: OtherData(3)
         * Handling string list data: List(5555, 6666)
         * Handling data: OtherData(List(7777, 8888))
         * Handling data: OtherData(List(uh oh!))
         */
      }
    }
    

    Note that it's still possible here to create an instance of OtherData that actually contains a List[String], in which case handleData is called instead of handleListData. You would need to be careful not to do this when creating the Data passed to handle. This is the best you can do if you really need to handle Any in the default case. You can also extend this pattern with other special cases by creating new subtypes of Data, including a case object to handle the "empty" case, if needed (similar to None for Option):

    case object NoData extends Data
    // ...
    PatternMatch.handle(NoData) // prints: 'Handling data: NoData'
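
One way to reduce the risk of wrapping a List[String] in OtherData by mistake is to classify raw values in a single place. The following is only a sketch under that assumption: the object DataFactory and the method fromAny are hypothetical names, and the Data hierarchy is repeated here so the example is self-contained.

```scala
object DataFactory {
  sealed trait Data
  final case class StringListData(l: List[String]) extends Data
  final case class OtherData(o: Any) extends Data
  case object NoData extends Data

  // Hypothetical smart constructor: inspects a raw value once, at the boundary,
  // so the rest of the code can match on Data without unchecked casts.
  def fromAny(value: Any): Data = value match {
    case None => NoData
    // Require a non-empty list whose elements are all strings.
    case l: List[_] if l.nonEmpty && l.forall(_.isInstanceOf[String]) =>
      StringListData(l.asInstanceOf[List[String]])
    case other => OtherData(other)
  }
}
```

With this sketch, DataFactory.fromAny(List("uh oh!")) yields a StringListData rather than an OtherData. An empty list is treated as OtherData here, which is an arbitrary choice, since its element type cannot be recovered at runtime.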