Here is the scenario: We have two data centers in a production cluster: one PROD and the other DR.
We create keyspaces that replicate to both data centers. No issues there. Here's the thought/question:
If we want to create a Prod-like environment (PL, a replica of production) that uses the existing servers but does not affect production, my thought is to create a new keyspace that specifies only the DR datacenter in the CREATE KEYSPACE command. We want to be sure that data is not propagated from the DR datacenter to the PROD datacenter, where it would affect storage and performance. I believe this will keep the data only on the DR datacenter and leave production alone. Does anyone see any issues with this?
Essentially this:
CREATE KEYSPACE PL_KS WITH replication =
{'class': 'NetworkTopologyStrategy', 'DR': '2'} AND durable_writes = true;
When PL transactions/queries execute against the PL keyspace, the drivers should be smart enough as well to not connect them to the production nodes, correct? So in essence, all PL activities should be against the DR datacenter nodes.
Your assumptions are correct. Because the PL_KS keyspace specifies replication only to the DR data center, only the nodes in the DR data center will be used to store its data.
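One way to confirm where the keyspace replicates is to read its definition back from the schema tables. This is a minimal sketch, assuming Cassandra 3.0 or later (where the system_schema.keyspaces table exists) and assuming the keyspace was created with the unquoted name shown in the question, which Cassandra stores in lowercase:

```cql
-- Unquoted identifiers are folded to lowercase, so PL_KS is stored as pl_ks.
-- The replication column should list only the DR data center.
SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'pl_ks';
```

The data center name in the replication map has to match the name the cluster reports (for example in the output of nodetool status) exactly, including case; otherwise no replicas will be placed where you expect.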
The only exception would be if the contact points defined in your app code are in your PROD data center. Then the app would use the PROD nodes for initial discovery only, and all subsequent operations would run against the DR nodes, provided the driver's load-balancing policy is configured with DR as its local data center.