scala, apache-spark, kryo

Registering Kryo classes is not working


I have the following code:

val conf = new SparkConf().setAppName("MyApp")
val sc = new SparkContext(conf)
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
new conf.registerKryoClasses(new Class<?>[]{
        Class.forName("org.apache.hadoop.io.LongWritable"),
        Class.forName("org.apache.hadoop.io.Text")
    });

But I am running into the following error:

')' expected but '[' found.
[error]                 new conf.registerKryoClasses(new Class<?>[]{

How can I solve this problem?


Solution

  • You're mixing Scala and Java syntax. In Scala you pass an Array[Class[_]] (instead of a Java Class<?>[]), and you don't prefix the method call with new:

    val conf = new SparkConf()
                .setAppName("MyApp")
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .registerKryoClasses(Array[Class[_]](
                  Class.forName("org.apache.hadoop.io.LongWritable"),
                  Class.forName("org.apache.hadoop.io.Text")
                ))
    
    val sc = new SparkContext(conf)
    

    We can do even better. Rather than referring to the classes via string literals (where a typo only surfaces at runtime), we can use classOf to obtain their class objects with compile-time checking:

    import org.apache.hadoop.io.LongWritable
    import org.apache.hadoop.io.Text
    
    val conf = new SparkConf()
                .setAppName("MyApp")
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .registerKryoClasses(Array[Class[_]](
                  classOf[LongWritable],
                  classOf[Text]
                ))
    
    val sc = new SparkContext(conf)
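
    For context, here is a minimal sketch of how the Kryo-configured context might be used. The SequenceFile path below is a placeholder, and reading Hadoop Writables is just one common situation in which registering these classes matters:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("MyApp")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array[Class[_]](classOf[LongWritable], classOf[Text]))

    val sc = new SparkContext(conf)

    // Read a SequenceFile whose keys/values are the registered Writable types
    // (the path is hypothetical), then convert to plain Strings before any shuffle.
    val records = sc.sequenceFile("hdfs:///path/to/data.seq", classOf[LongWritable], classOf[Text])
    val lines = records.map { case (_, text) => text.toString }
    println(lines.count())

    Registering the classes is optional with Kryo, but unregistered classes force Kryo to write the full class name with every serialized object, so registration keeps the serialized output smaller.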