
Scala: How to simplify nested pattern matching statements


I am writing a Hive UDF in Scala (because I want to learn Scala). To do this, I have to override three functions: evaluate, initialize and getDisplayString.

In the initialize function I have to:

  • Receive an array of ObjectInspector and return an ObjectInspector
  • Check if the array is null
  • Check if the array has the correct size
  • Check if the array contains the object of the correct type

To do this, I am using pattern matching and came up with the following function:

  override def initialize(genericInspectors: Array[ObjectInspector]): ObjectInspector = genericInspectors match {
    case null => throw new UDFArgumentException(functionNameString + ": ObjectInspector is null!")
    case _ if genericInspectors.length != 1 => throw new UDFArgumentException(functionNameString + ": requires exactly one argument.")
    case _ => {
      listInspector = genericInspectors(0) match {
        case concreteInspector: ListObjectInspector => concreteInspector
        case _ => throw new UDFArgumentException(functionNameString + ": requires an input array.")
      }
      PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(listInspector.getListElementObjectInspector.asInstanceOf[PrimitiveObjectInspector].getPrimitiveCategory)
    }
  }

The function works, but I think it could be more legible, and I would prefer to avoid this many levels of indentation.

Is there an idiomatic Scala way to improve the code above?


Solution

  • It's typical for patterns to include other patterns. Here, Array(x: String) matches only a one-element array whose element is a String, so the type of x is String.

    scala> val xs: Array[Any] = Array("x")
    xs: Array[Any] = Array(x)
    
    scala> xs match {
         | case null => ???
         | case Array(x: String) => x
         | case _ => ???
         | }
    res0: String = x
    

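    The idiom for "any number of args" is a sequence pattern, Array(_*), which matches an array of any length. It can serve as the catch-all for both a wrong element type and a wrong number of elements: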

    scala> val xs: Array[Any] = Array("x")
    xs: Array[Any] = Array(x)
    
    scala> xs match { case Array(x: String) => x case Array(_*) => ??? }
    res2: String = x
    
    scala> val xs: Array[Any] = Array(42)
    xs: Array[Any] = Array(42)
    
    scala> xs match { case Array(x: String) => x case Array(_*) => ??? }
    scala.NotImplementedError: an implementation is missing
      at scala.Predef$.$qmark$qmark$qmark(Predef.scala:230)
      ... 32 elided
    
    scala> Array("x","y") match { case Array(x: String) => x case Array(_*) => ??? }
    scala.NotImplementedError: an implementation is missing
      at scala.Predef$.$qmark$qmark$qmark(Predef.scala:230)
      ... 32 elided
    

    This answer should not be construed as advocating matching your way back to type safety.
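
Applied to the initialize function from the question, the nested-pattern idea might look like the sketch below. It is only a sketch: the traits and the exception class are hypothetical stand-ins for the Hive types so that it compiles without Hive on the classpath, and the function name "my_udf" and the helper requireListInspector are invented for illustration.

```scala
// Hypothetical stand-ins for the Hive types, so the sketch compiles without Hive.
trait ObjectInspector
trait ListObjectInspector extends ObjectInspector
class UDFArgumentException(message: String) extends Exception(message)

object InitializeSketch {
  // Hypothetical function name used in the error messages.
  val functionNameString = "my_udf"

  // One flat match: the null, arity, and type checks are each a single case.
  def requireListInspector(inspectors: Array[ObjectInspector]): ListObjectInspector =
    inspectors match {
      case null =>
        throw new UDFArgumentException(functionNameString + ": ObjectInspector is null!")
      // Exactly one element, and that element is a ListObjectInspector.
      case Array(list: ListObjectInspector) =>
        list
      // Exactly one element, but of some other type.
      case Array(_) =>
        throw new UDFArgumentException(functionNameString + ": requires an input array.")
      // Any other length, including zero.
      case _ =>
        throw new UDFArgumentException(functionNameString + ": requires exactly one argument.")
    }
}
```

In the real UDF, the value returned by the match would presumably be assigned to listInspector, and the PrimitiveObjectInspectorFactory call from the question would follow it unchanged.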