I am running Apache Spark 2.0.1 and Apache Zeppelin 0.6.2.
In Zeppelin, I have the following paragraph:
val df = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "iot_data2", "keyspace" -> "iot"))
  .load()
import org.apache.spark.sql.functions.{avg,round}
val ts = $"updated_time".cast("long")
val interval = (round(ts / 3600L) * 3600.0).cast("timestamp").alias("time")
df.groupBy($"a", $"b", $"date_bucket", interval).avg("t").createOrReplaceTempView("iot_avg")
In the next paragraph I try to plot a graph, but the value of avg("t") is always 0:
%sql
select time,avg("t") as avg_t from ble_temp_avg where a = '${a}' and b = '${b}' group by time order by time
I think I am missing something obvious, but as a new Spark and Zeppelin user I cannot tell what it is.
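It seems to work after I rewrote the paragraphs so that the first one only selects the columns (no aggregation) and the second one computes the average in SQL: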
In the first paragraph:
val df = sqlContext
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> "iot_data2", "keyspace" -> "iot"))
  .load()
import org.apache.spark.sql.functions.{avg,round}
val ts = $"updated_time".cast("long")
val interval = (round(ts / 3600L) * 3600.0).cast("timestamp").alias("time")
df.select($"a", $"b", $"date_bucket", interval, $"t").createOrReplaceTempView("iot_avg")
In the second paragraph:
%sql
select time,avg(t) as avg_t from iot_avg where a = 'test1' and b = 'test2' group by time order by time
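A side note on the interval expression used in both versions: as far as I understand, dividing a long column with / in Spark yields a double, so round(ts / 3600L) * 3600.0 snaps each timestamp to the nearest hour rather than to the start of its hour. The plain-Scala sketch below (no Spark needed) only illustrates that arithmetic; the object name HourBucket, its two helper functions, and the sample epoch values are made up for the illustration and are not part of the original code.

```scala
object HourBucket {
  // Nearest-hour bucketing: floating-point division followed by rounding,
  // mirroring the shape of round(ts / 3600) * 3600.
  def nearestHour(epochSeconds: Long): Long =
    math.round(epochSeconds / 3600.0) * 3600L

  // Start-of-hour bucketing: integer division truncates the remainder
  // (assumes non-negative epoch seconds).
  def startOfHour(epochSeconds: Long): Long =
    (epochSeconds / 3600L) * 3600L

  def main(args: Array[String]): Unit = {
    val base = 1475316000L // hypothetical sample: an exact multiple of 3600
    val samples = Seq(base + 900L, base + 2700L) // 15 and 45 minutes past the hour
    samples.foreach { ts =>
      println(s"$ts nearest=${nearestHour(ts)} start=${startOfHour(ts)}")
    }
  }
}
```

With rounding, a reading taken 45 minutes past the hour lands in the following hour's bucket. If each bucket should instead cover a full clock hour, using floor from org.apache.spark.sql.functions in place of round would presumably be the corresponding change in the Spark expression.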