Is it possible to have nodes from multiple datacenters join the same Spark cluster?


I am running a DataStax Enterprise cluster (with GossipingPropertyFileSnitch). I have two datacenters, Analytics and Cassandra. The Analytics nodes form a Spark cluster. I am considering merging the two clusters to better utilize resources.

When I enable Spark (in /etc/dse/default) on my Cassandra nodes I get a new master and it seems like those nodes aren't joining the same Spark cluster as the Analytics nodes. Can I somehow make the Cassandra datacenter nodes join the Analytics Spark cluster?


Solution

  • Because you're using GossipingPropertyFileSnitch, you must also change which DC the new Spark nodes are in. Otherwise they will continue to be in the so-named "Cassandra" datacenter.

    Edit: The short answer to your headline question is "No". Separate DCs are assigned separate Spark masters and don't share resources for Spark jobs.
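
As a rough sketch of the datacenter change mentioned above: with GossipingPropertyFileSnitch, each node reads its datacenter and rack from cassandra-rackdc.properties. The file location (often /etc/dse/cassandra/ on package installs) and the rack name below are assumptions, so check them against your own installation. Moving a node that already holds data to a different datacenter is not just a config edit; keyspace replication settings that name the old datacenter would also need attention.

```properties
# cassandra-rackdc.properties on a node that should belong to the Analytics DC
# (rack1 is a placeholder rack name, not taken from the question)
dc=Analytics
rack=rack1
```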