apache-spark, spark-streaming, hadoop2, apache-zeppelin

Use Apache Zeppelin with existing Spark Cluster


I want to install Zeppelin to use my existing Spark cluster. My setup is:

  • Spark Master (Spark 1.5.0 for Hadoop 2.4):
    • Zeppelin 0.5.5
  • Spark Slave

I downloaded Zeppelin v0.5.5 and built it via:

mvn clean package -Pspark-1.5 -Dspark.version=1.5.0 -Dhadoop.version=2.4.0 -Phadoop-2.4 -DskipTests

I noticed that the local[*] master setting also works without my Spark cluster (the notebook is still runnable after shutting down the Spark cluster).

My problem: when I want to use my Spark cluster for a streaming application, it does not seem to work correctly. My SQL table stays empty when I use spark://my_server:7077 as master; in local mode everything works fine!

See also my other question which describes the problem: Apache Zeppelin & Spark Streaming: Twitter Example only works local

Did I do something wrong

  • in the installation via "mvn clean package"?
  • in setting the master URL?
  • with the Spark and/or Hadoop version (are there any limitations)?
  • Do I have to set something special in the zeppelin-env.sh file (it is currently back on defaults)?
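For reference, this is roughly how I would expect conf/zeppelin-env.sh to look when pointing Zeppelin at an external cluster. The paths below are placeholders for my environment, not values from any documentation:

```shell
# conf/zeppelin-env.sh -- sketch; adjust paths/hostnames to your cluster
export MASTER=spark://my_server:7077   # point the Spark interpreter at the cluster instead of local[*]
export SPARK_HOME=/opt/spark-1.5.0     # assumption: location of the matching Spark 1.5.0 installation
```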

Solution

  • The problem was caused by a missing library dependency! So before searching around for too long, first check the dependencies and whether one is missing!

    %dep
    z.reset
    z.load("org.apache.spark:spark-streaming-twitter_2.10:1.5.1")
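With the dependency loaded, a minimal Twitter streaming paragraph might look like the following sketch. It assumes the twitter4j OAuth credentials have already been set as system properties in an earlier paragraph; sc is the SparkContext Zeppelin provides:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

// Batch interval of 2 seconds; sc comes from the Zeppelin Spark interpreter
val ssc = new StreamingContext(sc, Seconds(2))

// None: pick up twitter4j.oauth.* credentials from system properties
val tweets = TwitterUtils.createStream(ssc, None)

// Sanity check: print tweet texts on the driver before building SQL tables
tweets.map(_.getText).print()

ssc.start()
```

If this paragraph fails with a ClassNotFoundException for the Twitter classes, the %dep block above was not run (or was run after the interpreter had already started), which matches the symptom of an empty SQL table.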