I have a DataFrame with a column of vector type, produced by a one-hot encoder. Let's name the column Vector.
Given a case class Example(vector: WhichType), I want to map the DataFrame to a Dataset:
val ds = dataframe.as[Example]
The question is: which type should the property 'vector' in the case class have?
When I try, I get this error message:
need an array field but got struct<type:tinyint,size:int,indices:array<int>,values:array<double>>;
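The struct in the error message is how Spark stores an ML vector internally, so the field should be declared as a vector rather than an array. If you're using Spark ML (the org.apache.spark.ml package), you can use the Vector type imported below: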
import org.apache.spark.ml.linalg.Vector
case class Example(vector: Vector)
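
For a fuller picture, here is a minimal sketch of the whole round trip. It assumes Spark 2.x or later with spark-mllib on the classpath and a local session. The object name VectorDatasetSketch, the helper build, and the hand-built sparse vectors (standing in for real one-hot encoder output) are made up for illustration. The column is named "vector" here to match the case class field; with Spark's default case-insensitive resolution a column named "Vector" should resolve as well, but that has not been verified against your setup.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.{Dataset, SparkSession}

// Defined at top level so Spark can derive an encoder for it.
case class Example(vector: Vector)

object VectorDatasetSketch {
  // Hypothetical helper: builds a small DataFrame and maps it to Dataset[Example].
  def build(spark: SparkSession): Dataset[Example] = {
    import spark.implicits._

    // Stand-in for one-hot encoder output: sparse vectors of size 3.
    val dataframe = Seq(
      Tuple1(Vectors.sparse(3, Array(0), Array(1.0))),
      Tuple1(Vectors.sparse(3, Array(2), Array(1.0)))
    ).toDF("vector")

    // Works because the field type is ml.linalg.Vector, not an array.
    dataframe.as[Example]
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("vector-dataset-sketch")
      .getOrCreate()
    try build(spark).show(truncate = false)
    finally spark.stop()
  }
}
```

Note that the older org.apache.spark.mllib.linalg.Vector is a different type; if your column was produced by the ml pipeline API, the ml.linalg import is the one that should match.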