cassandra, datastax-enterprise, data-migration, solid-state-drive

Cassandra data migration from regular drives to SSDs on a production server


We want to migrate our data from regular drives to new SSD drives on a production server. How can we do that without taking a node down for longer than 4 hours (our hinted handoff window is 4 hours)? Our data is a few hundred gigabytes.

What I was thinking is: stop Cassandra on one node at a time, flush the data to disk, transfer the data from the old drives to the new drives, unmount the old disk, and bring the node back online. Is this the right approach? If so, my major concern is that the data migration to the new disk could take more than 4 hours, in which case I will lose hints.

Is there a better approach for migrating the data to the new disks?


Solution

  • Add the disk.

    Use rsync -avz --delete /old/data/dir/ /new/data/dir/ (note the trailing slashes, so the contents land directly in the new directory) to copy the SSTables from the old (spinning) drive to the new (SSD) drive. You can run this while Cassandra is running; the only risk is increased latency due to IO contention, which you can control with nice and ionice.
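    For example, a minimal sketch of the throttled initial copy (the paths here are placeholders for your actual data directories):

        # Run rsync at low CPU and IO priority to limit impact on the live node
        ionice -c2 -n7 nice -n19 \
            rsync -avz --delete /old/data/dir/ /new/data/dir/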

    Once you've run rsync, you'll have an idea of the upper bound of the migration time, and you'll also have an initial snapshot. Run it again and time the second pass; it's likely to be significantly faster, since rsync skips files that haven't changed, deletes files that have been removed, and copies only new files. If this second pass is faster than 4 hours (and it probably will be), you can proceed: run nodetool flush and nodetool drain, stop Cassandra, and run rsync a third time. Once that final rsync completes, change the data file directory path in the yaml and start Cassandra; hints will be delivered and you're good to go.
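    A sketch of that cutover sequence, assuming a systemd-managed Cassandra service and the stock cassandra.yaml key (both vary by install):

        # Second pass while the node is live, timed to estimate the offline window
        time rsync -avz --delete /old/data/dir/ /new/data/dir/

        # Flush memtables, stop accepting writes, then stop the node
        nodetool flush
        nodetool drain
        sudo systemctl stop cassandra

        # Final catch-up copy of anything written since the second pass
        rsync -avz --delete /old/data/dir/ /new/data/dir/

        # Point cassandra.yaml at the new path before restarting:
        #     data_file_directories:
        #         - /new/data/dir
        sudo systemctl start cassandra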

    Alternatively, do exactly the same thing, and if the cutover takes longer than the 4-hour hint window, follow it with a nodetool repair to pick up any writes you missed when the hints expired.
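    For instance, a primary-range repair keeps the extra work limited to the token ranges this node owns:

        # Repair only this node's primary token ranges after the migration
        nodetool repair -pr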