Tags: spring-boot, apache-spark, cassandra, spark-cassandra-connector, spring-data-cassandra

How to connect to multiple Cassandra nodes in different DCs


I'm setting up an application in which I use a Spark session to read data from Cassandra. I am able to read the data if I pass one Cassandra node from a single DC. But how can I connect to three different Cassandra nodes that belong to three different DCs in the Spark session?

Here is the code I am using:

Spark session:

SparkSession spark = SparkSession.builder().appName("SparkCassandraApp")
                .config("spark.cassandra.connection.host", cassandraContactPoints)
                .config("spark.cassandra.connection.port", cassandraPort)
                .config("spark.cassandra.auth.username", userName)
                .config("spark.cassandra.auth.password", password)
                .config("spark.dynamicAllocation.enabled", "false")
                .config("spark.shuffle.service.enabled", "false")
                .master("local[4]").getOrCreate();

Property file:

spring.data.cassandra.contact-points=cassandra1ofdc1, cassandra2ofdc2, cassandra3ofdc3
spring.data.cassandra.port=9042

When I try the above scenario, I get the following exception:

Caused by: java.lang.IllegalArgumentException: requirement failed: Contact points contain multiple data centers: dc1, dc2, dc3

Any help would be appreciated.

Thanks in advance.


Solution

  • The Spark Cassandra Connector (SCC) only uses nodes from the local data center, which is either defined by the spark.cassandra.connection.local_dc configuration parameter or determined from the DC of the contact point(s) (this is done by the function LocalNodeFirstLoadBalancingPolicy.determineDataCenter). SCC will never use nodes from other DCs...
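
For example, a minimal sketch of keeping the session inside one DC (the host and DC names below are taken from the question and the error message and stand in for real values): list contact points from a single data center only, and pin the local DC explicitly via spark.cassandra.connection.local_dc so the connector does not have to infer it from the contact points.

SparkSession spark = SparkSession.builder().appName("SparkCassandraApp")
                // contact points from one data center only
                .config("spark.cassandra.connection.host", "cassandra1ofdc1")
                .config("spark.cassandra.connection.port", "9042")
                // pin the local DC so SCC never considers nodes from dc2 or dc3
                .config("spark.cassandra.connection.local_dc", "dc1")
                .master("local[4]")
                .getOrCreate();

If the contact points come from the Spring property file shown in the question, the same rule applies there: spring.data.cassandra.contact-points should list only nodes belonging to that one data center.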