
Convert case class constructor parameters to String Array in Scala


I have a case class as follows:
case class MHealthUser(acc_Chest_X: Double, acc_Chest_Y: Double, acc_Chest_Z: Double, activityLabel: Int)

These form the schema of a Spark DataFrame, which is why I'm using a case class. I simply want to map these to an Array[String] so I can use the ParamValidators.inArray(attributes) method in Spark. I use the following code to map the constructor parameters to an array using reflection:

val attributes: Array[String] = MHealthUser.getClass.getConstructors.map(a => a.toString)

However, this gives me an array of length 1, whereas I want an array of length 4 containing the field names of the schema I've defined, as strings. The alternative is to hard-code the field names, which is inelegant. In other words, I want the output:

val attributes: Array[String] = Array("acc_Chest_X", "acc_Chest_Y", "acc_Chest_Z", "activityLabel")

I've been playing with this for a while and can't get it to work. Any ideas appreciated. Thanks!


Solution

  • I'd use ScalaReflection:

    import org.apache.spark.sql.catalyst.ScalaReflection
    import org.apache.spark.sql.types.StructType
    
    ScalaReflection.schemaFor[MHealthUser].dataType match {
      case s: StructType => s.fieldNames
      case _ => Array[String]()
    }
    

    Outside Spark, see the question "Scala. Get field names list from case class".
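
For the non-Spark case, one option is plain Java reflection on the case class itself. This is only a minimal sketch, not necessarily what the linked question recommends. The names `CaseClassFields` and `fieldNames` are made up for illustration. It assumes each constructor parameter of the case class is compiled to a field of the same name, and it relies on `getDeclaredFields` returning fields in declaration order, which common JVMs do but the Java specification does not guarantee. Note the use of `classOf[MHealthUser]`: `MHealthUser.getClass` returns the class of the companion object, not of the case class.

```scala
import java.lang.reflect.Modifier

// The case class from the question.
case class MHealthUser(acc_Chest_X: Double, acc_Chest_Y: Double, acc_Chest_Z: Double, activityLabel: Int)

// Hypothetical helper object, named here only for illustration.
object CaseClassFields {
  // Names of the instance fields declared directly on `cls`,
  // skipping static and compiler-generated (synthetic) fields.
  def fieldNames(cls: Class[_]): Array[String] =
    cls.getDeclaredFields
      .filterNot(f => f.isSynthetic || Modifier.isStatic(f.getModifiers))
      .map(_.getName)

  // classOf[MHealthUser] is the case class; MHealthUser.getClass would be its companion object.
  val attributes: Array[String] = fieldNames(classOf[MHealthUser])
}
```

`CaseClassFields.attributes` can then be passed to `ParamValidators.inArray` in the same way as the hard-coded array. If the order of the names matters, the Spark-based version above is the safer choice, since the order of a `StructType`'s fields is well defined.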