I have searched through other answers related to this question and they have not helped.
I am trying to add a column to a DataFrame. This column will have a datatype of Seq[CaseClass]. At first I thought it might be that Spark doesn't support collection-type columns, but this isn't the case.
Here is an example of the code I am trying to run. I just want to add an empty Seq[CaseClass] to each row that I can append to later on.
import org.apache.spark.sql.functions.lit

case class Employee(name: String)
val emptyEmployees: Seq[Employee] = Seq()

// this is the line that throws
df.withColumn("Employees", lit(emptyEmployees))
But then this error is thrown at the line with the withColumn:
Unsupported literal type class scala.collection.immutable.Nil$ List()
java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.Nil$ List()
lit() only supports a fixed set of simple literal types, which is why it chokes on your Seq. If you are using Spark 2.2+, just change lit() to typedLit(), according to this answer.
import org.apache.spark.sql.functions.typedLit
import spark.implicits._ // for the String encoder used by createDataset

case class Employee(name: String)
val emptyEmployees: Seq[Employee] = Seq()

val df = spark.createDataset(Seq("foo")).toDF("foo")
df.withColumn("Employees", typedLit(emptyEmployees)).show()
shows us:
+---+---------+
|foo|Employees|
+---+---------+
|foo| []|
+---+---------+
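If you want to confirm the column's type, a quick check (a minimal sketch, assuming the same df and emptyEmployees as above) is to print the schema, where the new column should appear as an array of structs:
// assuming the df and emptyEmployees from above;
// the Employees column should show up as array<struct<name:string>>
df.withColumn("Employees", typedLit(emptyEmployees)).printSchema()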
Update
For 2.1, the linked answer above for that version works by converting your lit(Array) into an array() of lit()s (with some magic Scala syntax). In your case, this will work because the array is empty.
import org.apache.spark.sql.functions.{array, lit}
import spark.implicits._

// turn each element into a lit() column, then wrap them all in array()
def asLitArray[T](xs: Seq[T]) = array(xs map lit: _*)

case class Employee(name: String)
val emptyEmployees: Seq[Employee] = Seq()

val df = spark.createDataset(Seq("foo")).toDF("foo")
df.withColumn("Employees", asLitArray(emptyEmployees)).show()
Which has the same result:
+---+---------+
|foo|Employees|
+---+---------+
|foo| []|
+---+---------+
To actually have something in your Seq would require a slightly different function, e.g. something like the sketch below.
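On 2.2+, typedLit handles a non-empty Seq of case classes directly, but on 2.1 you would have to build each element as a struct of lit()s yourself. Here is a minimal sketch of that idea; asLitStructArray is a hypothetical helper name, and it is hard-wired to the Employee shape above rather than fully generic:
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{array, lit, struct}

// hypothetical helper: build a literal array<struct<name:string>> column
// by turning each Employee into a struct of lit()s
def asLitStructArray(xs: Seq[Employee]): Column =
  array(xs.map(e => struct(lit(e.name).as("name"))): _*)

df.withColumn("Employees", asLitStructArray(Seq(Employee("Alice"), Employee("Bob")))).show(false)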