I run the following command:
spark-shell --packages datastax:spark-cassandra-connector:1.6.0-s_2.10
Then I stop the context with:
sc.stop
Then I run this code in the REPL:
val conf = new org.apache.spark.SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new org.apache.spark.SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val cc = new org.apache.spark.sql.cassandra.CassandraSQLContext(sc)
cc.setKeyspace("ksp")
cc.sql("SELECT * FROM continents").registerTempTable("conts")
val allContinents = sqlContext.sql("SELECT * FROM conts").collect
And I get:
org.apache.spark.sql.AnalysisException: Table not found: conts;
The keyspace ksp and the table continents are defined in Cassandra, so I suspect the error isn't coming from that side.
(Spark 1.6.0, 1.6.1)
This happens because you use one context to create the DataFrame and a different one to execute the SQL: registerTempTable only registers the table in the context it is called on (cc here), so the plain sqlContext has no table named conts.
val conf = new org.apache.spark.SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new org.apache.spark.SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val cc = new org.apache.spark.sql.cassandra.CassandraSQLContext(sc)
cc.setKeyspace("ksp")
cc.sql("SELECT * FROM continents").registerTempTable("conts")
// use cc instead of sqlContext
val allContinents = cc.sql("SELECT * FROM conts").collect
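Alternatively, a minimal sketch (assuming the connector's Data Source API, org.apache.spark.sql.cassandra, available in 1.6.0): load the table as a DataFrame through the ordinary SQLContext, so a single context both registers and queries the temp table.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// Load the Cassandra table via the Data Source API instead of CassandraSQLContext
val continents = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "ksp", "table" -> "continents"))
  .load()

// Register and query against the same SQLContext
continents.registerTempTable("conts")
val allContinents = sqlContext.sql("SELECT * FROM conts").collect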