mongodb cassandra replication database-replication

MongoDB to other DB syncing


We are planning to continuously sync a collection from MongoDB to another database (in this case, Cassandra).

I'm thinking of listening to the MongoDB oplog and pushing those changes to Cassandra. It's risky, since the data from MongoDB might be invalid for Cassandra, or the Cassandra cluster might go down at any moment. If Cassandra fails, we'd have to raise some sort of alert, route all read requests to MongoDB, and then re-sync data to Cassandra from the point of failure. That's a lot of work, and every extra piece adds another potential point of failure.
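To make the "re-sync from the point of failure" part concrete, here is a minimal sketch of what I have in mind, using only in-memory stand-ins. The names `replay`, `FlakySink` and `SinkUnavailable` are hypothetical and not part of any MongoDB or Cassandra driver. In a real setup the positions would presumably be oplog timestamps or change-stream resume tokens (pymongo's `Collection.watch()` accepts a `resume_after` argument), and the checkpoint would have to be stored durably.

```python
from typing import Callable, Iterable, Optional, Tuple

Event = Tuple[int, dict]  # (position in the change log, change payload)


class SinkUnavailable(Exception):
    """Raised by the hypothetical sink when the target database is unreachable."""


class FlakySink:
    """In-memory stand-in for the target table; writes fail while `up` is False."""

    def __init__(self) -> None:
        self.rows = {}
        self.up = True

    def upsert(self, change: dict) -> None:
        if not self.up:
            raise SinkUnavailable()
        # Keyed upsert: applying the same change twice leaves the same row.
        self.rows[change["_id"]] = change


def replay(events: Iterable[Event],
           apply: Callable[[dict], None],
           checkpoint: Optional[int] = None) -> Optional[int]:
    """Apply events newer than `checkpoint`, in order, and return the new checkpoint.

    Stops at the first failed write so that no event is skipped.
    """
    for position, change in events:
        if checkpoint is not None and position <= checkpoint:
            continue  # already applied before the outage
        try:
            apply(change)
        except SinkUnavailable:
            break  # leave the checkpoint at the last successful event
        checkpoint = position  # advance only after a successful write
    return checkpoint
```

The caller would alert and fail reads over when `replay` returns early, then call it again with the returned checkpoint once the sink is back. Because the writes are keyed upserts, re-applying an event whose checkpoint was not saved in time should be harmless.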

So, is there a best practice for this case, or any library or service that does this seamlessly? Thanks.


Solution

  • If you can publish the MongoDB updates to a Kafka topic, DataStax has an open-source Kafka connector for Cassandra. It would be a more resilient and highly available solution.

    For more info, see the Kafka connector for Cassandra docs and kafka-sink repository on GitHub.

    There's also a 15-minute Katacoda tutorial here if you're interested: https://www.datastax.com/dev/scenario/datastax-kafka-connector. Cheers!
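As a rough illustration of the "publish the MongoDB updates" step, the sketch below turns a change event into a key/value pair that a Kafka producer could send. It assumes the event has the shape of a MongoDB change-stream document (`operationType`, `documentKey`, `fullDocument`). The function name `to_record`, the JSON encoding, and the use of an empty value for deletes are illustrative choices, not something the connector prescribes, so check the connector docs for the format and delete handling it expects.

```python
import json
from typing import Optional, Tuple


def to_record(event: dict) -> Tuple[bytes, Optional[bytes]]:
    """Turn a change event into a (key, value) pair for a message broker."""
    # Key by document id so all changes to one document share a key.
    # default=str covers ids that are not JSON-native, such as ObjectId.
    key = json.dumps(event["documentKey"]["_id"], default=str).encode("utf-8")

    if event["operationType"] == "delete":
        return key, None  # no value: the document no longer exists

    doc = event.get("fullDocument")
    if doc is None:
        # Update events carry the full document only if it was requested
        # when the change stream was opened.
        raise ValueError("event carries no full document")

    value = json.dumps(doc, default=str, sort_keys=True).encode("utf-8")
    return key, value
```

For example, `to_record({"operationType": "delete", "documentKey": {"_id": "u1"}})` returns the encoded key and `None`. With pymongo, update events include `fullDocument` when the stream is opened with `full_document="updateLookup"`.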