Checking the repo on GitHub, I see cassandraFormat defined here. My import statement does not throw an exception:
import org.apache.spark.sql.cassandra._
df.write
.cassandraFormat("keyspace", "table")
.save()
<console>:34: error: value cassandraFormat is not a member of org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row]
cassandraFormat is not available under df.write, but it is under spark.read.
I am using Spark 2.1.1, and I invoke spark-shell with:
spark-shell --master spark://10.0.0.115:7077 --packages com.databricks:spark-csv_2.11:1.5.0,datastax:spark-cassandra-connector:1.6.6-s_2.11
Edit:
I did realise that cassandraFormat was basically an alias for .format().options(). However, a different error was returned:
df.write
.format("org.apache.spark.sql.cassandra")
.options(Map("table" -> "standard_feed", "keyspace" -> "testing"))
.save()
java.lang.AbstractMethodError: org.apache.spark.sql.cassandra.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;
at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:518)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:215)
I'm not so sure about cassandraFormat, but the easiest way to save a DataFrame to a Cassandra table is the following:
df.write
.format("org.apache.spark.sql.cassandra")
.options(Map("table" -> "table_name", "keyspace" -> "keyspace_name"))
.save()
About the error message you received:
I think the error is due to a version mismatch: you are using Spark 2.1.1 with cassandra-connector version 1.6.6. For Spark 2.1.x you need version 2.0 of the cassandra-connector; see the table here for the full list of version compatibilities.
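To make the fix concrete, here is a minimal sketch of the same write as a small standalone job, assuming a 2.0.x connector build is on the classpath. The package coordinate in the comment, the connection host, and the sample columns (id, value) are assumptions for illustration and are not taken from the question; the keyspace and table names are the ones from the question's edit.

```scala
// Assumes the job is launched with a 2.0.x connector, for example (assumed coordinate):
//   --packages datastax:spark-cassandra-connector:2.0.1-s_2.11
import org.apache.spark.sql.{SaveMode, SparkSession}

object CassandraWriteSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cassandra-write-sketch")
      // Assumed contact point; replace with a node of your Cassandra cluster.
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows; the column names must match the target table.
    val df = Seq((1, "first"), (2, "second")).toDF("id", "value")

    df.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "standard_feed", "keyspace" -> "testing"))
      .mode(SaveMode.Append) // add rows to the existing table
      .save()

    spark.stop()
  }
}
```

The target table has to exist already with matching columns. In spark-shell the session is already available as spark, so only the df.write part is needed there.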