A SnappyData job written in Scala aborts with the exception: java.lang.ClassCastException: com.....$Class1 cannot be cast to com.....$Class1.
Class1 is a custom class stored in an RDD. The interesting thing is that the error is thrown while casting the class to itself. So far, we have found no pattern to the failures.
In the job, we fetch data from HBase, enrich it with analytical metadata using DataFrames, and push it to a table in SnappyData. We are using SnappyData 1.2.0.1. We are not sure why this is happening.
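For context, this is roughly the shape of the job (a simplified sketch; Class1's fields, the table names, and the HBase column family/qualifier are placeholders, not our real ones):

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.spark.sql.{SaveMode, SnappySession, SparkSession}

    // Simplified placeholder for the custom class held in the RDD
    case class Class1(rowKey: String, payload: String)

    object EnrichJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("EnrichJob").getOrCreate()
        val snappy = new SnappySession(spark.sparkContext)

        // 1. Fetch from HBase
        val hbaseConf = HBaseConfiguration.create()
        hbaseConf.set(TableInputFormat.INPUT_TABLE, "source_table")
        val rdd = spark.sparkContext.newAPIHadoopRDD(
          hbaseConf, classOf[TableInputFormat],
          classOf[ImmutableBytesWritable], classOf[Result])

        val records = rdd.map { case (key, result) =>
          Class1(
            Bytes.toString(key.get()),
            Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"))))
        }

        // 2. Enrich with analytical metadata using DataFrames
        import snappy.implicits._
        val enriched = records.toDF().join(snappy.table("metadata_table"), "rowKey")

        // 3. Push to a SnappyData table
        enriched.write.format("column").mode(SaveMode.Append).saveAsTable("target_table")
      }
    }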
Below is the stack trace:

    Job aborted due to stage failure: Task 76 in stage 42.0 failed 4 times, most recent failure: Lost task 76.3 in stage 42.0 (TID 3550, HostName, executor XX.XX.x.xxx(10360):7872):
    java.lang.ClassCastException: cannot be cast to
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(generated.java:86)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenRDD$$anon$2.hasNext(WholeStageCodegenExec.scala:571)
        at org.apache.spark.sql.execution.WholeStageCodegenRDD$$anon$1.hasNext(WholeStageCodegenExec.scala:514)
        at org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.hasNext(InMemoryRelation.scala:132)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:233)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1006)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:997)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:936)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:997)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:700)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:41)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.sql.execution.WholeStageCodegenRDD.computeInternal(WholeStageCodegenExec.scala:557)
        at org.apache.spark.sql.execution.WholeStageCodegenRDD$$anon$1.<init>(WholeStageCodegenExec.scala:504)
        at org.apache.spark.sql.execution.WholeStageCodegenRDD.compute(WholeStageCodegenExec.scala:503)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:41)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:103)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:58)
        at org.apache.spark.scheduler.Task.run(Task.scala:126)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:326)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.spark.executor.SnappyExecutor$$anon$2$$anon$3.run(SnappyExecutor.scala:57)
        at java.lang.Thread.run(Thread.java:748)
Classes are not unique by name. They're unique by name + classloader.
A ClassCastException of the kind you're seeing happens when you pass data between parts of the application that were loaded by different classloaders: the two copies of Class1 share a name, but they are distinct classes at runtime, so the cast fails.
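You can reproduce the effect in isolation; the jar path and class name below are hypothetical. Two loaders with no common parent each define their own copy of the class, and casting one copy to the other fails with exactly the message you're seeing:

    import java.net.{URL, URLClassLoader}

    object SameNameDifferentLoader {
      def main(args: Array[String]): Unit = {
        // Hypothetical jar containing com.example.Class1
        val jar = new URL("file:/path/to/app.jar")

        // parent = null disables delegation, so each loader defines the class itself
        val loaderA = new URLClassLoader(Array(jar), null)
        val loaderB = new URLClassLoader(Array(jar), null)

        val classA = loaderA.loadClass("com.example.Class1")
        val classB = loaderB.loadClass("com.example.Class1")

        println(classA.getName == classB.getName) // true: same name
        println(classA == classB)                 // false: different defining loaders

        val instance = classA.getDeclaredConstructor().newInstance()
        classB.cast(instance) // java.lang.ClassCastException: Class1 cannot be cast to Class1
      }
    }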
You might need to clean up your classpath, you might need to resolve the classes from the same classloader, or you might have to serialize the data across the boundary (especially if you have features that rely on reloading code at runtime).
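If you go the serialization route, the usual pattern is to pass bytes rather than live objects, and to deserialize with a stream that resolves classes against the receiving side's loader. A minimal sketch, assuming the value's class implements java.io.Serializable (the names here are mine, not a SnappyData API):

    import java.io._

    // Resolves deserialized classes against a caller-supplied loader
    // instead of the default one
    class LoaderAwareObjectInputStream(in: InputStream, loader: ClassLoader)
        extends ObjectInputStream(in) {
      override protected def resolveClass(desc: ObjectStreamClass): Class[_] =
        Class.forName(desc.getName, false, loader)
    }

    object CrossLoaderCopy {
      // Serializes `value` and rebuilds it so that its class comes from targetLoader
      def copyInto(value: AnyRef, targetLoader: ClassLoader): AnyRef = {
        val buffer = new ByteArrayOutputStream()
        val out = new ObjectOutputStream(buffer)
        try out.writeObject(value) finally out.close()

        val in = new LoaderAwareObjectInputStream(
          new ByteArrayInputStream(buffer.toByteArray), targetLoader)
        try in.readObject() finally in.close()
      }
    }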