
How to enable Spark in Datastax Datacenter?


Our current DataStax datacenter contains 6 nodes, all of which have both Solr and Graph enabled:

root@ip-10-10-5-36:~# cat /etc/default/dse | grep -E 'SOLR_ENABLED|GRAPH_ENABLED'

GRAPH_ENABLED=1
SOLR_ENABLED=1

root@ip-10-10-5-36:~# nodetool status

Datacenter: SearchGraph
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens       Owns    Host ID                               Rack
UN  10.10.5.56  456.58 MiB  1            ?       936a1ac0-6d5e-4a94-8953-d5b5a2016b92  rack1
UN  10.10.5.46  406.24 MiB  1            ?       3f41dc2a-2672-47a1-90b5-a7c2bf17fb50  rack1
UN  10.10.5.76  392.99 MiB  1            ?       29f8fe44-3431-465e-b682-5d24e37d41d7  rack2
UN  10.10.5.66  414.16 MiB  1            ?       1f7de531-ff51-4581-bdb8-d9a686f1099e  rack2
UN  10.10.5.86  424.3 MiB   1            ?       27d37833-56c8-44bd-bac0-7511b8bd74e8  rack2
UN  10.10.5.36  511.44 MiB  1            ?       0822145f-4225-4ad3-b2be-c995cc230830  rack1

We are planning to add Spark to our existing datacenter. My questions are:

1) Will enabling Spark affect the existing data and services in DataStax?

2) Or, instead of setting SPARK_ENABLED=1 on the existing nodes, do we need to set up a separate datacenter for Spark?

Update:

3) How do DC1 and DC2 connect to each other in the ring? Is it based on both using the same cluster name in the cluster_name: parameter of /etc/dse/cassandra/cassandra.yaml?

4) Is any separate configuration needed to specify the Spark master in the datacenter?

5) Do I need to specify the SearchGraph (DC1) seed IPs in the seed configuration section of Spark (DC2), or do only the Spark seed IPs need to be specified in the DC2 configuration (cassandra.yaml)?


Solution

  • It's recommended to create a separate datacenter for DSE Analytics rather than enabling Spark on the existing Search/Graph nodes. The full process is described in the documentation.
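
As a rough illustration of what adding a separate analytics datacenter usually involves, here is a minimal sketch. It is not taken from the answer above: the new datacenter name "Analytics", the keyspace name and the replication factors are placeholders, and the exact steps for your DSE version should be checked against the documentation. It also assumes a snitch that reads the datacenter name from cassandra-rackdc.properties.

```shell
# Sketch only; names and values below are assumptions, not taken from the post.
KEYSPACE="my_keyspace"     # hypothetical keyspace that Spark jobs need to read
SEARCH_RF=3                # placeholder: keep whatever RF SearchGraph already uses
ANALYTICS_RF=3             # placeholder RF for the new datacenter

# 1) On each NEW node, in /etc/default/dse, enable the analytics workload:
#      SPARK_ENABLED=1
#
# 2) On each NEW node, in /etc/dse/cassandra/cassandra.yaml:
#      cluster_name: same value as on the existing SearchGraph nodes
#      seeds:        at least one seed from SearchGraph plus one from the new DC
#      auto_bootstrap: false
#    and, if the snitch reads cassandra-rackdc.properties, set there:
#      dc=Analytics

# 3) Once the new nodes are up, replicate the keyspace to the new datacenter.
cqlsh -e "ALTER KEYSPACE ${KEYSPACE} WITH replication = {'class': 'NetworkTopologyStrategy', 'SearchGraph': ${SEARCH_RF}, 'Analytics': ${ANALYTICS_RF}};"

# 4) On each NEW node, stream the existing data from the original datacenter.
nodetool rebuild -- SearchGraph
```

Nodes that share the same cluster_name and can reach each other through the seed list gossip as one cluster, which is why the new datacenter's seed list in this sketch includes a SearchGraph seed as well as its own.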