I am building microservices. One of them uses CQRS and event sourcing. Integration events are raised in the system, and I save my aggregates to the event store and also update my read model.
My question is: why do we need a version on the aggregate when we are appending to the event stream for that aggregate? I have read that we need it for consistency, that events must be replayed in sequence, and that we need to check the version before saving (https://blog.leifbattermann.de/2017/04/21/12-things-you-should-know-about-event-sourcing/). I still can't get my head around this, since events are raised and saved in order. I would really like a concrete example that shows what benefit the version gives us and why we need it at all.
Many thanks,
Imran
Let me describe a case where aggregate versions are useful:
In our reSolve framework, the aggregate version is used for optimistic concurrency control.
I'll explain it with an example. Let's say the InventoryItem aggregate accepts the commands AddItems and OrderItems. AddItems increases the number of items in stock, and OrderItems decreases it.
Suppose you have InventoryItem aggregate #123 with one event, ITEMS_ADDED, with a quantity of 5. The state of aggregate #123 says there are 5 items in stock.
So your UI shows users that there are 5 items in stock. User A decides to order 3 items, and user B decides to order 4. Both issue OrderItems commands at almost the same time; let's say user A is first by a couple of milliseconds.
Now, if you have a single instance of aggregate #123 in memory, handled in a single thread, you don't have a problem: the first command, from user A, succeeds, its event is applied, and the state says the quantity is 2, so the second command, from user B, fails.
In a distributed or serverless system, where the commands from A and B are handled in separate processes, both commands would succeed and bring the aggregate into an incorrect state unless we use some form of concurrency control. There are several ways to do this: pessimistic locking, a command queue, an aggregate repository, or optimistic locking.
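To make the failure concrete, here is a minimal Python sketch of that race with no concurrency control at all. The event shapes, the load_stock and order_items helpers, and the plain list standing in for the event store are assumptions made for illustration; none of this is reSolve's actual API.

```python
# Hypothetical in-memory event stream for aggregate #123 (illustration only).
events = [{"type": "ITEMS_ADDED", "quantity": 5}]


def load_stock(stream):
    """Rebuild the stock level by folding over the events in order."""
    stock = 0
    for event in stream:
        if event["type"] == "ITEMS_ADDED":
            stock += event["quantity"]
        elif event["type"] == "ITEMS_ORDERED":
            stock -= event["quantity"]
    return stock


def order_items(stock_seen, quantity):
    """Validate against the state this process loaded, then emit an event."""
    if quantity > stock_seen:
        raise ValueError("not enough items in stock")
    return {"type": "ITEMS_ORDERED", "quantity": quantity}


# Both processes restore the aggregate before either one has written.
stock_seen_by_a = load_stock(events)
stock_seen_by_b = load_stock(events)

# Each command passes validation against its own stale copy of the state,
# and nothing stops both events from being appended.
events.append(order_items(stock_seen_by_a, 3))
events.append(order_items(stock_seen_by_b, 4))

final_stock = load_stock(events)
print(final_stock)
```

Both events are saved "in order", yet the stream now describes a stock level of -2. Ordering alone does not help, because each event was decided against state that was already out of date.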
Optimistic locking seems to be the simplest and most practical solution:
We say that every aggregate has a version, which is the number of events in its stream. So our aggregate #123 has version 1.
When an aggregate emits an event, the event data carries an aggregate version. In our case, the ITEMS_ORDERED events from users A and B will both have an aggregate version of 2. Obviously, the versions of an aggregate's events should increase sequentially. So all we need to do is put a database constraint on the event store so that the tuple {aggregateId, aggregateVersion} must be unique on write.
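As a rough illustration of such a constraint, the following Python sketch uses the standard sqlite3 module with an in-memory database. The table layout, the column names, and the append_event helper are assumptions for this example, not the schema of any particular event store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        aggregate_id      TEXT    NOT NULL,
        aggregate_version INTEGER NOT NULL,
        type              TEXT    NOT NULL,
        payload           TEXT    NOT NULL,
        -- The uniqueness rule: one event per (aggregate, version) pair.
        PRIMARY KEY (aggregate_id, aggregate_version)
    )
    """
)


def append_event(aggregate_id, version, event_type, payload):
    """Insert one event; return False if that version is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events VALUES (?, ?, ?, ?)",
                (aggregate_id, version, event_type, payload),
            )
        return True
    except sqlite3.IntegrityError:
        # Another writer already stored this version of the aggregate.
        return False


first = append_event("123", 1, "ITEMS_ADDED", '{"quantity": 5}')
write_a = append_event("123", 2, "ITEMS_ORDERED", '{"quantity": 3}')
write_b = append_event("123", 2, "ITEMS_ORDERED", '{"quantity": 4}')
print(first, write_a, write_b)
```

The second write for version 2 is refused by the database itself, so the check holds no matter how many processes are writing.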
Let's see how our example would work in a distributed system with optimistic concurrency control:
1. User A issues an OrderItems command for aggregate #123.
2. Aggregate #123 is restored from events {version 1, quantity 5}.
3. User B issues an OrderItems command for aggregate #123.
4. Another instance of aggregate #123 is restored from events {version 1, quantity 5}.
5. The aggregate instance for user A performs the command. It succeeds, and the event ITEMS_ORDERED {aggregateId 123, version 2} is written to the event store.
6. The aggregate instance for user B performs the command. It succeeds and produces the event ITEMS_ORDERED {aggregateId 123, version 2}, but the attempt to write it to the event store fails with a concurrency exception.
7. On such an exception, the command handler for user B just repeats the whole procedure. This time aggregate #123 is in the state {version 2, quantity 2}, so the command is evaluated against up-to-date state: B's order for 4 items is rejected because only 2 are left.
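The whole sequence can be sketched in Python as follows. Everything here is a hypothetical stand-in written for this answer: the EventStore class, ConcurrencyError, decide_order, and handle_order are not reSolve's API, and the in-memory dictionary plays the role of the database with its unique constraint.

```python
class ConcurrencyError(Exception):
    """Raised when a version of the aggregate has already been written."""


class EventStore:
    """Hypothetical in-memory store keyed by (aggregate_id, version)."""

    def __init__(self):
        self._events = {}

    def load(self, aggregate_id):
        keys = sorted(k for k in self._events if k[0] == aggregate_id)
        return [self._events[k] for k in keys]

    def append(self, aggregate_id, version, event):
        # Stands in for the unique constraint on {aggregateId, aggregateVersion}.
        if (aggregate_id, version) in self._events:
            raise ConcurrencyError(f"version {version} already exists")
        self._events[(aggregate_id, version)] = event


def stock_of(events):
    """Fold the events into the current stock level."""
    stock = 0
    for e in events:
        stock += e["quantity"] if e["type"] == "ITEMS_ADDED" else -e["quantity"]
    return stock


def decide_order(events, quantity):
    """Validate against the loaded state; return (next_version, event)."""
    if quantity > stock_of(events):
        raise ValueError("not enough items in stock")
    return len(events) + 1, {"type": "ITEMS_ORDERED", "quantity": quantity}


def handle_order(store, aggregate_id, quantity, max_retries=3):
    """Load, decide, write; on a version conflict, repeat from the load."""
    for _ in range(max_retries):
        version, event = decide_order(store.load(aggregate_id), quantity)
        try:
            store.append(aggregate_id, version, event)
            return version
        except ConcurrencyError:
            continue
    raise ConcurrencyError("gave up after retries")


store = EventStore()
store.append("123", 1, {"type": "ITEMS_ADDED", "quantity": 5})

# Simulate the race: both instances are restored at version 1.
version_a, event_a = decide_order(store.load("123"), 3)
version_b, event_b = decide_order(store.load("123"), 4)
store.append("123", version_a, event_a)  # user A wins version 2
try:
    store.append("123", version_b, event_b)
    outcome_b = "written"
except ConcurrencyError:
    try:
        handle_order(store, "123", 4)  # user B repeats the whole procedure
        outcome_b = "written on retry"
    except ValueError:
        outcome_b = "rejected on retry"

print(outcome_b, stock_of(store.load("123")))
```

In this run user B's first write collides on version 2, and the retry sees only 2 items in stock, so the order is rejected and the stock stays at 2. A real handler would usually cap the number of retries, as max_retries does here, so that a heavily contended aggregate cannot loop forever.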
I hope this clarifies a case where aggregate versions are useful.