Adding the spark-csv dependency in Zeppelin produces a network error.
I went to the Spark interpreter settings in Zeppelin and added the spark-csv dependency com.databricks:spark-csv_2.10:1.2.0. I also added it in the interpreter's args option.
I restarted Zeppelin and ran the following code:
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
val df = sqlContext.read
.format("com.databricks.spark.csv")
.option("header", "true") // Use first line of all files as header
.option("inferSchema", "true") // Automatically infer data types
.load("https://github.com/databricks/spark-csv/raw/master/src/test/resources/cars.csv")
df.printSchema()
Am I adding the dependency correctly?
UPDATE
I tried changing the library to com.databricks:spark-csv_2.11:jar:1.6.0
and got the following error:
Error setting properties for interpreter 'spark.spark': Could not find artifact com.databricks:spark-csv_2.11:jar:1.6.0 in central (http://repo1.maven.org/maven2/)
It looks like you are using a fairly old library version, and one built for Scala 2.10, while your Spark appears to be built against Scala 2.11.
Change the package to com.databricks:spark-csv_2.11:1.5.0
and it should work.
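For reference, a minimal sketch of the corrected setup. This assumes a Spark build on Scala 2.11; note that 1.5.0 is the last spark-csv release published to Maven Central, which is why the 1.6.0 coordinate could not be resolved. Either set the artifact in the Spark interpreter's Dependencies section in the Zeppelin UI to com.databricks:spark-csv_2.11:1.5.0, or pass it to spark-submit via conf/zeppelin-env.sh:

```shell
# conf/zeppelin-env.sh — have Zeppelin's Spark interpreter pull the
# package from Maven Central at startup (restart the interpreter after).
export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.11:1.5.0"
```

Only one of the two mechanisms is needed; configuring the same artifact in both places is harmless but redundant.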