apache-spark, cassandra, spark-cassandra-connector

How to connect to more than one Cassandra host using the Spark Cassandra Connector


I have a Spark application that reads data from one Cassandra cluster and, after some computation, saves the results to another Cassandra cluster. I can set only one Cassandra configuration in SparkConf, but I need to connect to a second Cassandra cluster as well.

I see that there is a CassandraConnector class for connecting to Cassandra, but it is created from a CassandraConnectorConf object, which takes a lot of parameters that I don't know how to set.

Any assistance would be appreciated.


Solution

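  • Create a separate SparkConf that carries the connection settings for the second cluster, build a CassandraConnector from it, and pass that connector to the writer with withConnector. Operations that do not specify a connector keep using the Cassandra host set in the application's own SparkConf. Here rdd is the RDD to save and Table is the Java class mapped to the table's columns:

    import org.apache.spark.SparkConf;
    import com.datastax.spark.connector.cql.CassandraConnector;
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;
    
    // Connection settings for the second cluster; used only to build the connector.
    SparkConf confForCassandra = new SparkConf().setAppName("ConnectToCassandra")
                    .setMaster("local[*]")
                    .set("spark.cassandra.connection.host", "<cassandraHost>");
    CassandraConnector connector = CassandraConnector.apply(confForCassandra);
    
    // Write rdd to the second cluster by overriding the default connector.
    javaFunctions(rdd).writerBuilder("keyspace", "table", mapToRow(Table.class))
            .withConnector(connector).saveToCassandra();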