The SparkConf has the method registerKryoClasses:

def registerKryoClasses(classes: Array[Class[_]]): SparkConf = { .. }
However, it is not available/exposed via the RuntimeConfig facade provided by the SparkSession.conf attribute:
@transient lazy val conf: RuntimeConfig = new RuntimeConfig(sessionState.conf)
Here is more about RuntimeConfig:
/**
* Runtime configuration interface for Spark. To access this, use `SparkSession.conf`.
*
* Options set here are automatically propagated to the Hadoop configuration during I/O.
*
* @since 2.0.0
*/
@InterfaceStability.Stable
class RuntimeConfig private[sql](sqlConf: SQLConf = new SQLConf) {
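As that interface suggests, RuntimeConfig only deals in plain string key/value settings. A minimal sketch of what it does expose (the config key here is just an illustrative runtime-settable option):

spark.conf.set("spark.sql.shuffle.partitions", "64")
spark.conf.get("spark.sql.shuffle.partitions")   // returns "64"
spark.conf.unset("spark.sql.shuffle.partitions")

There is no hook in this interface for class registration, which is a startup-time SparkConf concern.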
There is a clear workaround for this when creating our own SparkSession: we can invoke set(key, value) and registerKryoClasses on the SparkConf that is provided to the builder:

val mySparkConf = new SparkConf().set(someKey, someVal)
mySparkConf.registerKryoClasses(Array(classOf[Array[InternalRow]]))
SparkSession.builder.config(mySparkConf)
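For completeness, a minimal end-to-end sketch of that workaround (assuming we also want Kryo as the actual serializer; the app name is just a placeholder):

import org.apache.spark.SparkConf
import org.apache.spark.sql.{Row, SparkSession}

val mySparkConf = new SparkConf()
  .setAppName("kryo-demo")  // placeholder
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
// Registration must happen before the session is built
mySparkConf.registerKryoClasses(Array(classOf[Row]))
val spark = SparkSession.builder.config(mySparkConf).getOrCreate()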
And then one that is not so obvious:
conf.registerKryoClasses(Array(classOf[scala.reflect.ClassTag$$anon$1]))
But when running the Spark shell, the SparkSession/SparkContext are already created. So how can such non-runtime settings be put into effect?
The particular need here is:
sparkConf.registerKryoClasses(Array(classOf[org.apache.spark.sql.Row]))
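As an aside, since these are startup-time settings, one route for this particular need is to pass them when launching the shell; a sketch using Spark's documented spark.kryo.classesToRegister option:

spark-shell \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryo.classesToRegister=org.apache.spark.sql.Row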
When attempting instead to invoke that on the RuntimeConfig available from the spark session object, we get this error:
scala> spark.conf.registerKryoClasses(Array(classOf[Row]))
error: value registerKryoClasses is not a member of org.apache.spark.sql.RuntimeConfig
       spark.conf.registerKryoClasses(Array(classOf[Row]))
So then, how can the Kryo serializers be registered in the spark-shell?
The following is not an exact answer to [my own] question, but it seems to serve as a workaround for the present specific predicament:
implicit val generalRowEncoder: Encoder[Row] = org.apache.spark.sql.Encoders.kryo[Row]
Having this implicit in scope seems to register the classes with Kryo directly on the SparkConf.
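For illustration, a minimal usage sketch in the shell (the map below needs an Encoder[Row], so the implicit Kryo-backed encoder is picked up):

import org.apache.spark.sql.{Encoder, Encoders, Row}

implicit val generalRowEncoder: Encoder[Row] = Encoders.kryo[Row]

// map requires an implicit Encoder[Row]; the Kryo one above satisfies it
val ds = spark.range(3).map(i => Row(i, i * 2))
ds.count()   // 3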