cassandra datastax datastax-enterprise cassandra-2.1

How does COPY work in Cassandra when a table is replicated across multiple nodes in a cluster?


Suppose I want to copy a table from a 7-node cluster with RF = 3 to another cluster of 6 nodes with RF = 3. How can I do that? Can I copy the data from any one node to a CSV file and then import that CSV file through any node in the new cluster? Or should I copy the data from each and every node in the old cluster to the new cluster?

Should I decrease the replication factor to 1, copy the data, and then change replication back to 3? I don't think that will work in production. How can I tackle this?


Solution

  • It's not something you have to run on each node. You can run cqlsh's COPY command from a machine outside the cluster. Restoring a cluster from sstables/commitlogs is where you need to worry about that (which sstableloader solves as well).

    COPY TO reads all the data in the table, and COPY FROM sends each row through the normal write path, which distributes it according to your RF. It's done far more efficiently than a basic read/write script, but that's ultimately still what it's doing.
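
    As a rough sketch of that workflow, the Python helper below builds the two cqlsh invocations: one COPY TO against a single node of the source cluster and one COPY FROM against a single node of the target cluster. The host names, the table name, and the CSV path are placeholders, not values from the question, and the sketch assumes cqlsh is on the PATH and that the table already exists in the target cluster, since COPY FROM does not create it.

```python
import subprocess


def copy_to_cmd(host, table, csv_path):
    """Build a cqlsh call that exports the whole table to a CSV file.

    Any single node of the source cluster can act as coordinator;
    it reads the rows from whichever replicas hold them.
    """
    stmt = f"COPY {table} TO '{csv_path}' WITH HEADER = TRUE"
    return ["cqlsh", host, "-e", stmt]


def copy_from_cmd(host, table, csv_path):
    """Build a cqlsh call that imports the CSV file into the table.

    Each row goes through the normal write path, so the target
    cluster places replicas according to its own RF.
    """
    stmt = f"COPY {table} FROM '{csv_path}' WITH HEADER = TRUE"
    return ["cqlsh", host, "-e", stmt]


def migrate(source_host, target_host, table, csv_path, run=subprocess.run):
    """Export from one source node, then import through one target node."""
    run(copy_to_cmd(source_host, table, csv_path), check=True)
    run(copy_from_cmd(target_host, table, csv_path), check=True)


# Hypothetical usage (hosts, table, and path are placeholders):
# migrate("old-node-1", "new-node-1", "my_keyspace.my_table", "/tmp/my_table.csv")
```

    Only one export and one import are issued, and the replication factor is never changed on either cluster. For very large tables, sstableloader may be the better fit, as mentioned above.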