Search code examples
apache-sparkcassandradatastax-java-driverspark-cassandra-connector

What happens - NoSuchMethodError: com.datastax.driver.core.ResultSet.fetchMoreResults


cassandra-connector-assembly-2.0.0 built from github project.

with Scala 2.11.8, cassandra-driver-core-3.1.0

sc.cassandraTable("mykeyspace", "mytable").select("something").where("key=?", key).mapPartitions(par => {
    par.map({ row => (row.getString("something"), 1 ) })
})
.reduceByKey(_ + _).collect().foreach(println)

The same job works fine for reading less mass data

java.lang.NoSuchMethodError: com.datastax.driver.core.ResultSet.fetchMoreResults()Lshade/com/datastax/spark/connector/google/common/util/concurrent/ListenableFuture;
    at com.datastax.spark.connector.rdd.reader.PrefetchingResultSetIterator.maybePrefetch(PrefetchingResultSetIterator.scala:26)
    at com.datastax.spark.connector.rdd.reader.PrefetchingResultSetIterator.next(PrefetchingResultSetIterator.scala:39)
    at com.datastax.spark.connector.rdd.reader.PrefetchingResultSetIterator.next(PrefetchingResultSetIterator.scala:17)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at scala.collection.Iterator$$anon$12.next(Iterator.scala:444)
    at com.datastax.spark.connector.util.CountingIterator.next(CountingIterator.scala:16)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
    at org.apache.spark.scheduler.Task.run(Task.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Can any one suggest or point out to the issue, and a possible solution?


Solution

  • It is a conflict with the Cassandra driver-core that

    libraryDependencies += "com.datastax.spark" % "spark-cassandra-connector_2.11" % "2.0.0-M3"
    

    brings in.

    If you go into the ~/.ivy2/cache/com.datastax.spark/spark-cassandra-connector_2.11 you will find a file called ivy-2.0.0-M3.xml

    In that file the dependency is

    com.datastax.cassandra" name="cassandra-driver-core" rev="3.0.2" force="true" 
    

    Note that it is the 3.0.2 version of Cassandra driver core which gets overrun by the more recent one.

    It just so happens that the latest source on Github does not show a implementation for fetchMoreResults which is inherited from interface PagingIterable

    If you roll back the Git version to 3.0.x on Github, you'll find

     public ListenableFuture<ResultSet> fetchMoreResults();
    

    So it looks like the newest Cassandra core drivers were rushed out the door incomplete. Or I might be missing something. Hope this helps.

    tl;dr; Remove the latest driver and use the one embedded in the spark cassandra connector.