In my current project, I use a combination of Kafka and Cassandra to implement the Event Store that I need for the Domain services (CQRS). I'm now at the stage of evolving events, and I want to rebuild the query side. I found that I have the same events in both Kafka and Cassandra, which is a redundancy I'm not comfortable with. Anyway, I now have two options:
I do not believe there is a conclusive answer to this question, as both approaches would work. However, here is my two pennies' worth. Also, I am not entirely comfortable with the idea of having two sources of truth (Cassandra and Kafka) - you may want to review the rationale behind that decision.
TL;DR: reading from Kafka is good if you need to rebuild the "whole" view model. However, it requires special precautions when writing and reading events.
Kafka can run as a (semi-)permanent store, in that events stay in the log for a user-configurable amount of time, which can be set to unlimited. In addition, appending an event to Kafka's log is a very cheap, O(1) operation. Therefore, you can keep data for as long as you wish without a major impact on the performance of your solution.
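To illustrate, here is a minimal sketch of creating such a topic with unlimited retention (`retention.ms=-1` and `retention.bytes=-1` tell Kafka to keep records forever). The topic name `orders-events`, the broker address, and the partition/replication counts are assumptions for the example:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Map;
import java.util.Set;

public class CreateEventStoreTopic {
    public static void main(String[] args) throws Exception {
        Map<String, Object> adminProps = Map.of(
                AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(adminProps)) {
            // Hypothetical topic: 8 partitions, replication factor 3.
            // retention.ms/retention.bytes of -1 disable time- and
            // size-based deletion, turning the topic into a permanent log.
            NewTopic topic = new NewTopic("orders-events", 8, (short) 3)
                    .configs(Map.of(
                            TopicConfig.RETENTION_MS_CONFIG, "-1",
                            TopicConfig.RETENTION_BYTES_CONFIG, "-1"));
            admin.createTopics(Set.of(topic)).all().get();
        }
    }
}
```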
However, using Kafka as an event store requires some special precautions. Firstly, Kafka's ordering guarantee applies only within a partition, not across the whole topic; therefore, all events for a given stream ID need to be written to the same partition - keying records by stream ID achieves this, as in the sketch below.
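A minimal producer sketch, assuming the hypothetical `orders-events` topic above and JSON-serialized event payloads. Using the stream ID as the record key means Kafka's default partitioner hashes it, so every event of a stream lands in the same partition and retains its order:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Map;

public class EventAppender {
    private final KafkaProducer<String, String> producer;

    public EventAppender(String bootstrapServers) {
        producer = new KafkaProducer<>(Map.of(
                ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers,
                ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName(),
                ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName(),
                // Idempotence prevents retries from reordering events within a stream.
                ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true"));
    }

    public void append(String streamId, String eventJson) {
        // The key (stream ID) determines the partition, so per-stream
        // ordering is preserved.
        producer.send(new ProducerRecord<>("orders-events", streamId, eventJson));
    }
}
```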
The same applies to the reading side: in order to rebuild the read model, you need to read each partition sequentially from offset 0 (different partitions can be read in parallel, for instance with a consumer group, since ordering only matters within a partition). A sketch follows below.
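A sketch of such a replay, again assuming the `orders-events` topic; `applyToReadModel` is a hypothetical projection hook standing in for your query-side store. It uses manual partition assignment rather than `subscribe()`, so there is no group rebalancing and no committed offsets interfering with the replay:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ReadModelRebuilder {
    public static void main(String[] args) {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(Map.of(
                ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092",
                ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName(),
                ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName()))) {

            // Assign every partition explicitly instead of subscribing,
            // so we control exactly where reading starts.
            List<TopicPartition> partitions = consumer.partitionsFor("orders-events").stream()
                    .map(p -> new TopicPartition(p.topic(), p.partition()))
                    .collect(Collectors.toList());
            consumer.assign(partitions);
            consumer.seekToBeginning(partitions); // replay from offset 0

            // Snapshot the current end offsets so we know when the replay is done.
            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);

            boolean done = false;
            while (!done) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Per-stream order is preserved because each stream
                    // lives in a single partition.
                    applyToReadModel(record.key(), record.value());
                }
                done = partitions.stream()
                        .allMatch(tp -> consumer.position(tp) >= endOffsets.get(tp));
            }
        }
    }

    static void applyToReadModel(String streamId, String eventJson) {
        // Hypothetical projection hook: update the query-side store here.
        System.out.printf("stream=%s event=%s%n", streamId, eventJson);
    }
}
```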
With that in mind, if you need to rebuild the whole model (and therefore do not need to pick out specific streams) or to build projections, I would simply use Kafka.