I am testing out Ignite Spark Integration and running into issues when Spark is to perform operations on Ignite RDD. Spark master and the worker are up and running along with 2 node Ignite cluster. Getting InvalidClassException Please suggest what can be done to overcome this.
Spark version 2.3.0. Ignite version 2.10.0
Spark shell is started with
spark-2.3.0-bin-hadoop2.7]# ./bin/spark-shell --packages org.apache.ignite:ignite-spark:1.8.0,org.apache.ignite:ignite-spring:1.8.0 --conf spark.driver.extraClassPath=/opt/apache-ignite-2.10.0-bin/libs/*:/opt/apache-ignite-2.10.0-bin/libs/optional/ignite-spark/*:/opt/apache-ignite-2.10.0-bin/libs/optional/ignite-log4j/*:/opt/apache-ignite-2.10.0-bin/libs/optional/ignite-yarn/*:/opt/apache-ignite-2.10.0-bin/libs/ignite-spring/* --jars /opt/apache-ignite-2.10.0-bin/libs/ignite-core-2.10.0.jar,/opt/apache-ignite-2.10.0-bin/libs/ignite-spark-2.10.0.jar,/opt/apache-ignite-2.10.0-bin/libs/cache-api-1.0.0.jar,/opt/apache-ignite-2.10.0-bin/libs/ignite-log4j-2.10.0.jar,/opt/apache-ignite-2.10.0-bin/libs/log4j-1.2.17.jar,/opt/apache-ignite-2.10.0-bin/libs/ignite-spring/ignite-spring-2.10.0.jar --master spark://xx.xx.xx.xx:7077
Execption attached below:
scala> val ic = new IgniteContext(sc, "/opt/apache-ignite-2.10.0-bin/examples/config/spark/example-shared-rdd.xml")
2021-08-18 11:19:35 WARN IgniteKernal:576 - Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
2021-08-18 11:19:35 WARN TcpCommunicationSpi:576 - Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
2021-08-18 11:19:35 WARN NoopCheckpointSpi:576 - Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
2021-08-18 11:19:35 WARN GridCollisionManager:576 - Collision resolution is disabled (all jobs will be activated upon arrival).
2021-08-18 11:19:35 WARN IgniteH2Indexing:576 - Serialization of Java objects in H2 was enabled.
2021-08-18 11:19:37 WARN GridClusterStateProcessor:576 - Received state finish message with unexpected ID: ChangeGlobalStateFinishMessage [id=36bcec75b71-46c2fd15-4c79-4b79-a1d5-998b63dc7dd6, reqId=bfecc55e-bbe0-4451-95ef-3f8020b7b97e, state=ACTIVE, transitionRes=true]
ic: org.apache.ignite.spark.IgniteContext = org.apache.ignite.spark.IgniteContext@5af7e2f
scala> val sharedRDD = ic.fromCache[Integer, Integer]("partitioned")
sharedRDD: org.apache.ignite.spark.IgniteRDD[Integer,Integer] = IgniteRDD[0] at RDD at IgniteAbstractRDD.scala:32
scala> sharedRDD.filter(_._2 < 10).collect()
[Stage 0:> (0 + 4) / 1024]
2021-08-18 11:20:08 WARN TaskSetManager:66 - Lost task 3.0 in stage 0.0 (TID 3, 172.16.8.181, executor 0): java.io.InvalidClassException: org.apache.ignite.spark.impl.IgnitePartition; local class incompatible: stream classdesc serialVersionUID = -2372563944236654956, local class serialVersionUID = -7759805985368763313
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2001)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1848)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2158)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:501)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:459)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:313)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2021-08-18 11:20:08 ERROR TaskSetManager:70 - Task 0 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 25, 172.16.8.181, executor 0): java.io.InvalidClassException: org.apache.ignite.spark.impl.IgnitePartition; local class incompatible: stream classdesc serialVersionUID = -2372563944236654956, local class serialVersionUID = -7759805985368763313
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2001)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1848)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2158)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:501)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:459)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:313)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1587)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1586)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1586)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1820)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1769)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1758)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2027)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2048)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2067)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2092)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:939)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.collect(RDD.scala:938)
... 53 elided
Caused by: java.io.InvalidClassException: org.apache.ignite.spark.impl.IgnitePartition; local class incompatible: stream classdesc serialVersionUID = -2372563944236654956, local class serialVersionUID = -7759805985368763313
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2001)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1848)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2158)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:501)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:459)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:313)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="cacheConfiguration">
<!-- SharedRDD cache example configuration (Atomic mode). -->
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<!-- Set a cache name. -->
<property name="name" value="sharedRDD"/>
<!-- Set a cache mode. -->
<property name="cacheMode" value="PARTITIONED"/>
<!-- Index Integer pairs used in the example. -->
<property name="indexedTypes">
<list>
<value>java.lang.Integer</value>
<value>java.lang.Integer</value>
</list>
</property>
<!-- Set atomicity mode. -->
<property name="atomicityMode" value="ATOMIC"/>
<!-- Configure a number of backups. -->
<property name="backups" value="1"/>
</bean>
</property>
<property name="workDirectory" value="/opt/apache-ignite-2.10.0-bin/ignitework"/>
<property name="consistentId" value="Spark test1"/>
<!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<!--
Ignite provides several options for automatic discovery that can be used
instead os static IP based discovery. For information on all options refer
to our documentation: http://apacheignite.readme.io/docs/cluster-config
-->
<!-- Uncomment static IP finder to enable static-based discovery of initial nodes. -->
<!--<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">-->
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
<property name="addresses">
<list>
<!-- In distributed environment, replace with actual host IP address. -->
<value>IP1:47500..47509</value>
<value>IP2:47500..47509</value>
</list>
</property>
</bean>
</property>
</bean>
</property>
</bean>
</beans>
Kindly help with a solution to this issue.
Thanks in advance.
The issue got fixed by adding the correct package version while starting the sparkshell.
--packages org.apache.ignite:ignite-spark:2.10.0,org.apache.ignite:ignite-spring:2.10.0