Tags: microservices, scale, cqrs, event-sourcing

How to implement Event sourcing and a database in a microservice architecture?


I have been learning lately about microservices architecture and its features. In this source it appears that event sourcing replaces a database; however, it is later stated:

The event store is difficult to query since it requires typical queries to reconstruct the state of the business entities. That is likely to be complex and inefficient. As a result, the application must use Command Query Responsibility Segregation (CQRS) to implement queries.

On the CQRS page, the author seems to describe a single database that listens to all events and rebuilds itself from them.

My questions are:

What is actually needed to implement event sourcing with a queryable database? In particular:

  • Where is the events database?
  • Where is the queryable database?
  • Do I need a separate event store for every service, or can I store events in a message broker like Kafka?
  • Is the CQRS database actually one "whole" database that collects all the events?
  • How can all of this scale?

I'm sorry if I'm not clear with my question, I am very confused myself. I guess I'm looking for a full example architecture of how things will look in the grand picture.


Solution

  • Where is the queryable database?

    I'm guessing this is the most useful starting point, because it will be most familiar. The queryable database is in the same place that your this-is-the-entire-database was when you weren't doing event sourcing.

    That could be a database exclusively to support this microservice, or it could be a database that is shared by several microservices, with some part of the schema where this microservice has exclusive write authority. Another way of thinking about this: the microservices are using different logical databases, which might be physically deployed together.

    Where is the events database?

    Same general idea - you can have one events database per microservice; or you could have several different microservices sharing the same database. Again, you have partitioning of authority, and the same logical vs physical separation to consider.

    What changes with the introduction of events and CQRS is that the query/reporting database no longer stores the authoritative copy of the information that is used by the microservice. The authoritative information lives in the event store, and the query/reporting database acts more like a cache.

    Our command handlers will typically load information only from the authoritative store (aka the events); that's the data that we lock if we are processing commands concurrently.
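    As a rough illustration of a command handler that reads only from the events, here is a minimal Python sketch. The bank-account domain and the names `Deposited`, `Withdrawn`, `balance_from` and `handle_withdraw` are made up for the example; they do not come from any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def balance_from(events):
    """Fold the event history into the current balance."""
    balance = 0
    for event in events:
        if isinstance(event, Deposited):
            balance += event.amount
        elif isinstance(event, Withdrawn):
            balance -= event.amount
    return balance


def handle_withdraw(history, amount):
    """Decide which new events a withdraw command produces.

    The decision is based only on the event history (the authoritative
    store), never on the query/reporting database.
    """
    if amount > balance_from(history):
        raise ValueError("insufficient funds")
    return [Withdrawn(amount)]


# Hypothetical history for one account, as loaded from the event store.
history = [Deposited(100), Withdrawn(30)]
new_events = handle_withdraw(history, 50)
```

    The handler returns new events rather than mutating anything; appending them to the event store is a separate step.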

    We copy information that is stored in the events into the query/reporting database(s). Depending on our needs, that can be done synchronously by the command handlers, but it is more common to use background batch processing to do that work, meaning that the data in the reporting database will often be a little bit stale.
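    The background copy can be sketched as a projector that remembers how far into the event log it has read. This is only an in-memory illustration: `ReadModel`, `project` and the tuple layout of the events are assumptions for the example, and a real projector would read from the event database and write to the reporting database.

```python
class ReadModel:
    """Stand-in for the query/reporting database."""

    def __init__(self):
        self.balances = {}   # account_id -> balance
        self.checkpoint = 0  # number of events already applied


def project(event_log, read_model):
    """Apply every event the read model has not seen yet."""
    for account_id, kind, amount in event_log[read_model.checkpoint:]:
        delta = amount if kind == "deposited" else -amount
        read_model.balances[account_id] = (
            read_model.balances.get(account_id, 0) + delta
        )
    read_model.checkpoint = len(event_log)


# Hypothetical event log shared by two accounts.
event_log = [("acct-1", "deposited", 100), ("acct-2", "deposited", 40)]
view = ReadModel()
project(event_log, view)

# A new event is written; until the next batch runs, the view is stale.
event_log.append(("acct-1", "withdrawn", 30))
stale = dict(view.balances)

project(event_log, view)  # the next batch run catches up
```

    Between the append and the second `project` call, `stale` still shows the old balance for `acct-1`, which is the "a little bit stale" behaviour described above.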

    can I store events in a message broker like Kafka?

    Current consensus is that Kafka cannot reliably be used for event sourcing as understood by the CQRS community.

    Roughly, the problem is this: when you have two processes with the authority to write events, how do you ensure that they don't introduce inconsistencies? With event stores we can use locks, or conditional writes (aka compare and swap), to ensure that nobody came along and snuck in a few extra events that might change the events we are writing.
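    A conditional write can be sketched as "append only if the stream is still at the version I read". The in-memory store below is a toy written for this answer, not the API of any real event store; `InMemoryEventStore`, `ConcurrencyError` and `expected_version` are illustrative names.

```python
import threading


class ConcurrencyError(Exception):
    """Raised when the stream changed since the writer last read it."""


class InMemoryEventStore:
    def __init__(self):
        self._streams = {}
        self._lock = threading.Lock()

    def load(self, stream_id):
        """Return the events of a stream and its current version."""
        with self._lock:
            events = list(self._streams.get(stream_id, []))
        return events, len(events)

    def append(self, stream_id, new_events, expected_version):
        """Append only if nobody else has written in the meantime."""
        with self._lock:
            stream = self._streams.setdefault(stream_id, [])
            if len(stream) != expected_version:
                raise ConcurrencyError(
                    f"expected version {expected_version}, found {len(stream)}"
                )
            stream.extend(new_events)
            return len(stream)


store = InMemoryEventStore()

# Two writers read the same (empty) stream, so both see version 0.
_, version_a = store.load("acct-1")
_, version_b = store.load("acct-1")

store.append("acct-1", ["opened"], version_a)  # first writer wins

try:
    store.append("acct-1", ["opened"], version_b)  # second writer is rejected
    conflict = False
except ConcurrencyError:
    conflict = True
```

    The rejected writer would normally reload the stream, re-run its decision against the new history, and try again.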

    With Kafka, there doesn't seem to be a mechanism that supports that kind of prevention, so you need to lean more into apologies: detect the conflict after the fact and compensate for it.

    Is the CQRS database actually one "whole" database that collects all the events?

    Logically? No. But you certainly can combine them physically into the same appliance. For example, message-db is "just" a Postgres schema with some tables, functions, and so on. You certainly could combine that with the tables you use for queries and reports.
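    To make the "same appliance" idea concrete, here is a small sketch that keeps an events table and a query table in one physical database. It uses SQLite from the Python standard library purely as a stand-in for Postgres; the table layout is invented for the example and is not message-db's actual schema. It also happens to show the synchronous variant, where the command side updates the query table in the same transaction as the event write.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE events (                 -- logical database 1: the event store
    position  INTEGER PRIMARY KEY AUTOINCREMENT,
    stream_id TEXT NOT NULL,
    type      TEXT NOT NULL,
    data      TEXT NOT NULL
);
CREATE TABLE account_balances (       -- logical database 2: the query model
    account_id TEXT PRIMARY KEY,
    balance    INTEGER NOT NULL
);
""")


def record_deposit(account_id, amount):
    """Write the event and update the query table in one transaction."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO events (stream_id, type, data) VALUES (?, ?, ?)",
            (account_id, "Deposited", json.dumps({"amount": amount})),
        )
        db.execute(
            "INSERT OR IGNORE INTO account_balances (account_id, balance) "
            "VALUES (?, 0)",
            (account_id,),
        )
        db.execute(
            "UPDATE account_balances SET balance = balance + ? "
            "WHERE account_id = ?",
            (amount, account_id),
        )


record_deposit("acct-1", 100)
record_deposit("acct-1", 25)

balance = db.execute(
    "SELECT balance FROM account_balances WHERE account_id = ?", ("acct-1",)
).fetchone()[0]
event_count = db.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

    The two tables remain logically separate: only `events` is authoritative, and `account_balances` could be dropped and rebuilt from it at any time.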

    I'm looking for a full example architecture of how things will look in the grand picture.

    The materials published by Greg Young in 2010 might be a decent starting point.