I'm using Apache Spark 2.1.0 and Cassandra 3.0.14. In my code I want to create a connection between Spark and Cassandra:
...
SparkSession sparkSession = SparkSession.builder()
        .appName(appName)
        .config("spark.cassandra.connection.host", "localhost")
        .config("spark.cassandra.connection.port", 9042)
        .getOrCreate();

CassandraConnector cassandraConnector = CassandraConnector
        .apply(sparkSession.sparkContext().getConf());
Session session = cassandraConnector.openSession();
ResultSet rs = session.execute("select * from myDB.myTable");
...
When I run the code locally in Eclipse, everything works fine, but when I submit the jar file to my local Spark server I get:
Exception in thread "main" java.lang.NullPointerException
The call that triggers the error is
cassandraConnector.openSession();
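An NPE this deep inside the connector often points at a classpath conflict rather than at the call site itself. A quick, stdlib-only way to check which jar a suspect class is actually loaded from at runtime; this is a diagnostic sketch of mine, not code from the original post (`JarLocator` and the probe class are hypothetical names):

```java
public class JarLocator {
    /**
     * Returns the jar or directory a class was loaded from. Classes defined
     * by the bootstrap loader have no code source, so that case is reported
     * explicitly instead of dereferencing null.
     */
    static String locationOf(Class<?> cls) {
        java.security.CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap classpath)" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // On the cluster you would probe a class from the suspect library
        // instead, to see which copy of it actually wins on the classpath.
        System.out.println(locationOf(JarLocator.class));
    }
}
```

Running this once from Eclipse and once via spark-submit shows whether the two environments resolve the same jar; if the printed locations differ, the server's classpath is shadowing one of your dependencies.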
This is my pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>xign_analysis</groupId>
<artifactId>xign_analysis_jar_archive</artifactId>
<version>0.0.1-SNAPSHOT</version>
<properties>
<maven.compiler.target>1.8</maven.compiler.target>
<maven.compiler.source>1.8</maven.compiler.source>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<build>
</build>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>2.1.1</version>
<scope>compile</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.1.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>2.1.1</version>
<scope>compile</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.cassandra/cassandra-all -->
<dependency>
<groupId>org.apache.cassandra</groupId>
<artifactId>cassandra-all</artifactId>
<version>3.11.0</version>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>log4j-over-slf4j</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.10 -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>2.0.5</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
</dependency>
</dependencies>
</project>
I'm using a MacBook with El Capitan (10.11.6). My Spark master, Spark worker, and Cassandra server are all running fine. I have no idea how to fix this issue.
I found the solution. In the spark/jars directory there is an older version of Guava (Google Core Libraries for Java). I replaced this older version (14.0.1) with a newer one (23.0) and everything works fine.
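For anyone who prefers not to touch the jars shipped with Spark: a common alternative is to relocate Guava inside your own application jar with the maven-shade-plugin, so your code uses its bundled Guava regardless of what spark/jars contains. A sketch only, not from the original post (the relocation prefix is an arbitrary choice), which would go inside the empty `<build>` section of the pom above:

```xml
<plugins>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.1.0</version>
    <executions>
      <execution>
        <phase>package</phase>
        <goals><goal>shade</goal></goals>
        <configuration>
          <relocations>
            <relocation>
              <pattern>com.google.common</pattern>
              <!-- hypothetical prefix; any unique package name works -->
              <shadedPattern>shaded.com.google.common</shadedPattern>
            </relocation>
          </relocations>
        </configuration>
      </execution>
    </executions>
  </plugin>
</plugins>
```

With the relocation in place, `mvn package` rewrites all `com.google.common` references in the shaded jar, so the old Guava 14.0.1 in spark/jars can no longer collide with the version your dependencies expect.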