java jakarta-ee cassandra hector

What is the fastest way to copy a column family in Cassandra?


I want to create a copy of a column family under another name in Cassandra, using Hector (or any other client). What is the fastest way to do this?

Thanks


Solution

  • The Cassandra Hadoop integration reads a whole column family as input to a MapReduce job, and it can also write output in bulk to a column family. Read the code in the org.apache.cassandra.hadoop package to get an idea of what to do.

    For the read, it works out which token ranges live on which nodes and then does a get_range_slice over each token range (it also splits each token range into manageable chunks). For the write, it does (or can do, if you use the Bulk* classes) something similar to the solutions above: it constructs an SSTable and then uploads that to Cassandra.

    I suspect the sstable2json approach in the other answers above would be far more efficient, but this would work.
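The "split the token range into manageable chunks" step can be sketched in plain Java. This is only an illustration, not the code from org.apache.cassandra.hadoop: the class name TokenRangeSplitter and its split method are hypothetical, and it assumes numeric tokens (as with RandomPartitioner) and a range that does not wrap around the ring. Each resulting sub-range would then be used as the start and end token of one range query against the source column family.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra or Hector.
final class TokenRangeSplitter {

    /**
     * Splits the token range from start to end into `parts` contiguous
     * sub-ranges of (nearly) equal width.
     * Assumption: start < end, i.e. the range does not wrap around the ring.
     */
    static List<BigInteger[]> split(BigInteger start, BigInteger end, int parts) {
        if (parts < 1 || start.compareTo(end) >= 0) {
            throw new IllegalArgumentException("need parts >= 1 and start < end");
        }
        BigInteger width = end.subtract(start);
        BigInteger n = BigInteger.valueOf(parts);
        List<BigInteger[]> ranges = new ArrayList<>();
        BigInteger lo = start;
        for (int i = 1; i <= parts; i++) {
            // Derive every boundary from the original start so that
            // integer-division rounding does not accumulate.
            BigInteger hi = (i == parts)
                    ? end
                    : start.add(width.multiply(BigInteger.valueOf(i)).divide(n));
            ranges.add(new BigInteger[] { lo, hi });
            lo = hi;
        }
        return ranges;
    }
}
```

For example, TokenRangeSplitter.split(BigInteger.ZERO, BigInteger.valueOf(100), 4) yields sub-ranges with boundaries at 25, 50, 75 and 100. In a real copy job, the start and end would come from the token ranges the cluster reports for each node, and the number of parts would be chosen so that each chunk holds a manageable number of rows.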