Tags: apache-spark, cassandra, apache-spark-sql, spark-cassandra-connector

Spark SQL - registered temporary table not found


I run the following command:

spark-shell --packages datastax:spark-cassandra-connector:1.6.0-s_2.10

Then I stop the context with:

sc.stop

Then I run this code in the REPL:

val conf = new org.apache.spark.SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new org.apache.spark.SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val cc = new org.apache.spark.sql.cassandra.CassandraSQLContext(sc)

cc.setKeyspace("ksp")

cc.sql("SELECT * FROM continents").registerTempTable("conts")

val allContinents = sqlContext.sql("SELECT * FROM conts").collect

And I get:

org.apache.spark.sql.AnalysisException: Table not found: conts;

The keyspace ksp and the table continents are defined in Cassandra, so I suspect the error isn't on that side.

(Spark 1.6.0, 1.6.1)


Solution

  • You use one context (cc) to create and register the DataFrame, but a different one (sqlContext) to run the query. Temporary tables are registered per context, so conts only exists in cc's catalog and the plain sqlContext cannot see it. Run the query through cc instead:

    val conf = new org.apache.spark.SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new org.apache.spark.SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val cc = new org.apache.spark.sql.cassandra.CassandraSQLContext(sc)
    
    cc.setKeyspace("ksp")
    
    cc.sql("SELECT * FROM continents").registerTempTable("conts")
    
    // use cc instead of sqlContext
    val allContinents = cc.sql("SELECT * FROM conts").collect
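
    Alternatively, if you prefer to stay with a plain SQLContext, you can load the Cassandra table through the connector's DataFrame reader instead of CassandraSQLContext, and then both the registration and the query go through the same context. A minimal sketch, assuming the same ksp.continents table and the connector's org.apache.spark.sql.cassandra data source format:

        import org.apache.spark.{SparkConf, SparkContext}
        import org.apache.spark.sql.SQLContext

        val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)

        // Load ksp.continents as a DataFrame via the connector's data source
        val continents = sqlContext.read
          .format("org.apache.spark.sql.cassandra")
          .options(Map("keyspace" -> "ksp", "table" -> "continents"))
          .load()

        // The temp table is registered in sqlContext's own catalog,
        // so querying it through that same context works
        continents.registerTempTable("conts")
        val allContinents = sqlContext.sql("SELECT * FROM conts").collect()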