Explanations about Event Sourcing, Microservices, CQRS


I am currently building an app, and I would like to use microservices as the architectural pattern and GraphQL for communication. I am thinking about using Kafka / RabbitMQ + AuthZ + Auth0 + Apollo + Prisma, all running on Docker. I found many resources on event sourcing and its advantages and disadvantages, but I am stuck on how it works in the real world. So far, this is how I would do it:

Microservice CQRS

  • Apollo Engine to monitor requests / responses.
  • Auth0 for authentication management
  • AuthZ for authorization
  • A GraphQL gateway. Sadly I did not find a reliable solution; I guess I have to build it myself using Apollo + graphql-tools to merge schemas.

And ideally:

  • Prisma for the read side of Bill's MS
  • Node.js for the write side of Bill's MS

Now, if I understand correctly, using Apache Kafka + ZooKeeper:

  • Kafka as the message broker
  • ZooKeeper as an event store.

If I am right, can I assume:

  • There would be 2 ways to validate if the request is valid:

    • The write side only gets events (from the event store, AKA ZooKeeper) to validate whether the requested mutation is possible.
    • The write side gets a snapshot from a traditional database to validate the requested mutation.

    Then it publishes an event to Kafka (I assume Kafka updates ZooKeeper automatically), and the message can then be used by the read side to update a private snapshot of the entity. Of course, this message can also be used by other MSs. I do not know Apache Kafka + ZooKeeper very well; in the past I only used a messaging service such as RabbitMQ. They seem similar in shape but very different in usage.

  • Is the main difference between event sourcing and basic messaging the use of the event store instead of an entity's snapshot? In this case, can we assume that not all MSs need an event-store approach (I mean, validating via the event store and not via a "private" database)? If yes, can anyone explain when you need an event store and when you do not?

Solution

  • I'll try to answer your major concerns on a concept level without getting tied up with the specifics of frameworks and implementations. Hope this will help.

    There would be 2 ways to validate if the request is valid:

    • The write side only gets events (from the event store, AKA ZooKeeper) to validate whether the requested mutation is possible.

    • The write side gets a snapshot from a traditional database to validate the requested mutation.

    I'd go with the first option. To execute a command, you should rely on the current event stream as the authority for determining your model's current state.

    The read model of your architecture is only eventually consistent, which means there is an arbitrary delay between a command happening and its effect being reflected in the read model. Although you can work on your architecture to make this delay as small as possible (even ignoring the cost of doing so), there will always be a window in which your read model is not yet up to date.

    That being said, your commands should be run against your command model, built from your current event store.
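    A minimal sketch of that first option, assuming a hypothetical in-memory event store and made-up `BillCreated` / `BillPaid` events (none of these names come from a real library): the handler rebuilds the bill's state from its event stream, validates the command against that state, and only then appends a new event. Concurrency control (for example, checking an expected stream version on append) is left out.

```python
from collections import defaultdict


class InMemoryEventStore:
    """Hypothetical stand-in for a real event store: one append-only list per stream."""

    def __init__(self):
        self._streams = defaultdict(list)

    def load(self, stream_id):
        # Return a copy so callers cannot mutate the stored history.
        return list(self._streams[stream_id])

    def append(self, stream_id, event):
        self._streams[stream_id].append(event)


def rehydrate(events):
    """Rebuild the current state of a bill by folding over its events."""
    state = {"exists": False, "paid": False}
    for event in events:
        if event["type"] == "BillCreated":
            state = {"exists": True, "paid": False}
        elif event["type"] == "BillPaid":
            state = {**state, "paid": True}
    return state


def handle_pay_bill(store, bill_id):
    """Validate the command against the event stream, not against a read model."""
    state = rehydrate(store.load(bill_id))
    if not state["exists"]:
        raise ValueError("bill does not exist")
    if state["paid"]:
        raise ValueError("bill already paid")
    event = {"type": "BillPaid", "bill_id": bill_id}
    store.append(bill_id, event)
    return event


store = InMemoryEventStore()
store.append("bill-1", {"type": "BillCreated", "bill_id": "bill-1"})
handle_pay_bill(store, "bill-1")
try:
    handle_pay_bill(store, "bill-1")  # rejected: the stream already contains BillPaid
except ValueError as err:
    print(err)  # bill already paid
```

    Because the second command is checked against the stream itself, it is rejected even if a read-side snapshot has not yet seen the first payment.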

    Is the main difference between event sourcing and basic messaging the use of the event store instead of an entity's snapshot? In this case, can we assume that not all MSs need an event-store approach (I mean, validating via the event store and not via a "private" database)? If yes, can anyone explain when you need an event store and when you do not?

    The whole concept of Event Sourcing is this: instead of storing your state as an "updatable" piece of data that only reflects the latest state of that data, you store your state as a series of actions (events) that can be interpreted to reach that state.

    So, imagine you have a piece of your domain which reads (in free-form notation):

    Entity A = { Id: 1; Name: "Something"; }

    And something happens and a command arrives to change the name of such entity to "Other Thing".

    In traditional storage, you would fetch that record and update it to:

    { Id: 1; Name: "Other Thing"; }

    But in an event-sourced storage, you wouldn't have such a record, you would have an event stream, with data such as:

    {Entity Created with Id = 1} > {Entity with Id = 1 renamed to "Something"} > {Entity with Id = 1 renamed to "Other Thing"}

    Now if you "replay" these events in order, you will reach the same state as the traditional storage, only you will "know" how you got to that state, whereas the traditional storage forgets that history.
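    The replay described above can be sketched as a simple fold over the stream. The event classes and the `apply` / `replay` helpers below are hypothetical names chosen for illustration, not part of any framework:

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical event types mirroring the stream in the example.
@dataclass(frozen=True)
class EntityCreated:
    entity_id: int


@dataclass(frozen=True)
class EntityRenamed:
    entity_id: int
    name: str


@dataclass
class Entity:
    id: int
    name: Optional[str] = None


def apply(state, event):
    """Fold one event into the current state and return the new state."""
    if isinstance(event, EntityCreated):
        return Entity(id=event.entity_id)
    if isinstance(event, EntityRenamed):
        if state is None:
            raise ValueError("rename before creation")
        return Entity(id=state.id, name=event.name)
    raise TypeError(f"unknown event: {event!r}")


def replay(events):
    """Rebuild the entity by applying every event in order."""
    state = None
    for event in events:
        state = apply(state, event)
    return state


stream = [
    EntityCreated(1),
    EntityRenamed(1, "Something"),
    EntityRenamed(1, "Other Thing"),
]
current = replay(stream)
print(current)  # Entity(id=1, name='Other Thing')
```

    Replaying only a prefix of the stream, for example `replay(stream[:2])`, yields the earlier state that a traditional store would have overwritten.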

    Now, to answer your question, you're absolutely right. Not all microservices should use an event store; doing so everywhere is not even recommended. In fact, in a microservices architecture each microservice should have its own persistence mechanism (often each using a different technology), and no microservice should have direct access to another's persistence (as your diagram implies with "Another MS" reaching into the "Event Store" of your "Bill's MS").

    So, the basic decision factors for you should be:

    • Is your microservice one where you gain more from actively storing the evolution of state inside the domain (rather than reactively logging it)?

    • Is your microservice's domain one where you are interested in analyzing old computations? (That is, being able to restore the domain to a given point in time so you can understand how its state evolved. Consider something like complex auditing, where you want to understand past computations.)

    • Even if you answer "yes" to both of these questions... will the added complexity of such architecture be worth it?
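    To make the "restore the domain to a given point in time" criterion concrete, here is a small sketch. The timestamps, event shapes, and the `state_as_of` helper are assumptions made up for this example, and it assumes events are stored in chronological order:

```python
from datetime import datetime

# Hypothetical timestamped events for a single entity (dates are invented).
events = [
    (datetime(2024, 1, 1), {"type": "Created", "id": 1}),
    (datetime(2024, 2, 1), {"type": "Renamed", "name": "Something"}),
    (datetime(2024, 3, 1), {"type": "Renamed", "name": "Other Thing"}),
]


def state_as_of(events, moment):
    """Replay only the events recorded at or before `moment`."""
    state = None
    for recorded_at, event in events:
        if recorded_at > moment:
            break  # relies on the chronological ordering assumed above
        if event["type"] == "Created":
            state = {"id": event["id"], "name": None}
        elif event["type"] == "Renamed":
            state = {**state, "name": event["name"]}
    return state


print(state_as_of(events, datetime(2024, 2, 15)))  # {'id': 1, 'name': 'Something'}
```

    A store that only keeps the latest record cannot answer this kind of question without separate audit logging, which is the trade-off the questions above are meant to surface.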

    Just as a closing remark on this topic, note there are multiple patterns intertwined in your model:

    • Event Sourcing is just the act of storing state as a series of actions instead of an updatable central data hub.
    • The pattern that deals with having a Read Model vs. a Command Model is called CQRS (Command-Query Responsibility Segregation).

    These 2 patterns are frequently used together because they match up so nicely, but neither requires the other. You can store your data as events without using CQRS to split it into two models, AND you can organize your domain in two models (commands and queries) without storing either of them primarily as events.
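    As an illustration of the second combination (CQRS without Event Sourcing), here is a minimal sketch in which both sides store plain current state. `CommandModel`, `QueryModel`, and `sync` are hypothetical names; in a real system the synchronization would typically run asynchronously, for example via a message broker:

```python
class CommandModel:
    """Write side: validates commands and keeps current state only (no event stream)."""

    def __init__(self):
        self.entities = {}
        self.pending = []  # ids of changes not yet pushed to the read side

    def rename(self, entity_id, name):
        if not name:
            raise ValueError("name must not be empty")
        self.entities[entity_id] = {"id": entity_id, "name": name}
        self.pending.append(entity_id)


class QueryModel:
    """Read side: a denormalized view shaped for queries."""

    def __init__(self):
        self.names_by_id = {}

    def get_name(self, entity_id):
        return self.names_by_id.get(entity_id)


def sync(command_model, query_model):
    """Push pending changes to the read side (done synchronously here for simplicity)."""
    while command_model.pending:
        entity_id = command_model.pending.pop(0)
        query_model.names_by_id[entity_id] = command_model.entities[entity_id]["name"]


writes, reads = CommandModel(), QueryModel()
writes.rename(1, "Other Thing")
stale = reads.get_name(1)  # None: the read side has not caught up yet
sync(writes, reads)
fresh = reads.get_name(1)  # "Other Thing"
```

    The gap between `stale` and `fresh` is the eventual-consistency window discussed earlier; it exists whether or not the write side is event-sourced.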