aggregate domain-driven-design bounded-contexts event-storming

What is an Aggregate?

Aggregates are described in many different ways. Two significant descriptions are cited below:

From https://medium.com/ingeniouslysimple/aggregates-in-domain-driven-design-5aab3ef9901d

Aggregate is used for domain simplification.
An aggregate is an encapsulation of domain objects (entities and value objects) which conceptually belong together.
An aggregate can be treated as a single unit.
Each aggregate has one aggregate root.

From https://www.ibm.com/cloud/architecture/architecture/practices/event-storming-methodology-architecture/

Aggregate can be identified by grouping events and commands that are related together.
This grouping not only consists of related data (domain objects: entities and value objects) but also related actions (commands) that are connected by the lifecycle of that aggregate.
Aggregates suggest microservice boundaries.

They seem to be similar (see the second sentence of the second one). The difference is “Aggregate can be identified by grouping events and commands that are related together”.

Now to my question: my understanding of an aggregate is, for example, a car with its parts, where the car is the Aggregate Root. Why do events and/or commands help to identify aggregates, and why is this necessary?

Solution

In the Domain Driven Design world, the authoritative definition of aggregate is from Eric Evans's book Domain-Driven Design: Tackling Complexity in the Heart of Software.

An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes.

Note that the definition of AGGREGATE appears in Chapter 6: it is a lifecycle management pattern (like FACTORY and REPOSITORY) not a modeling pattern (like ENTITY and VALUE OBJECT).

One of the types of complexity that needs to be tackled in software is managing change of information. If any information in your system can be changed from anywhere in your program, then the number of code paths you need to consider explodes geometrically, and correctness becomes more expensive.

Via chapter 5, we've already got the idea of clumping information and the behaviors that change that information together into ENTITIES (not an idea original to Evans, of course), but we still have a complexity tax to pay if we can send messages to ENTITIES from anywhere.

So his suggestion is that we hide some of the entities within aggregates, and ensure that access to those entities can only happen via designated choke points (the "root" entities of the aggregates, which have a relaxed set of access constraints).

To effect a change, you send a message to the root entity of the aggregate, and that entity collaborates with the other entities in the aggregate to achieve the change.

If you squint, you may recognize this as a form of encapsulation: we're putting a bunch of entities together into the aggregate, and to simplify the messaging paths we arrange so that the capsule is opaque except for access to the root entity.
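As a rough illustration of that capsule, here is a minimal Python sketch. The `Order` root, the `_LineItem` entity, and the order-limit rule are all hypothetical names invented for this example, not something taken from Evans; the only point is that every change goes through the root, which is the one place where the rule is checked.

```python
from dataclasses import dataclass, field


@dataclass
class _LineItem:
    """Internal entity (hypothetical): only the root creates or replaces it."""
    sku: str
    quantity: int
    unit_price_cents: int

    def subtotal(self) -> int:
        return self.quantity * self.unit_price_cents


@dataclass
class Order:
    """Aggregate root (hypothetical): the single entry point for changes."""
    order_id: str
    max_total_cents: int
    _items: dict[str, _LineItem] = field(default_factory=dict)

    def add_item(self, sku: str, quantity: int, unit_price_cents: int) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        current = self._items.get(sku)
        merged = _LineItem(
            sku, quantity + (current.quantity if current else 0), unit_price_cents
        )
        # The rule spans several entities, so the root checks it before changing anything.
        others = sum(i.subtotal() for s, i in self._items.items() if s != sku)
        if others + merged.subtotal() > self.max_total_cents:
            raise ValueError("order limit exceeded")
        self._items[sku] = merged

    def total_cents(self) -> int:
        return sum(i.subtotal() for i in self._items.values())

    def quantities(self) -> dict[str, int]:
        # Hand out a copy of plain values, never the internal entities themselves.
        return {sku: i.quantity for sku, i in self._items.items()}
```

Callers never hold a `_LineItem`, so there is only one code path to reason about when asking whether the order limit can be violated.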

If you get that idea, you are then faced with another hard problem: for your Online Pet Store / Cargo Shipment Tracking System / Your Domain Here model, how do you figure out which entities belong in the same capsule?

In other words, how do we figure out which boundaries to implement so that we actually accrue all of the wonderful cost savings that we've been promised aggregates would deliver?

And there are a number of ways that you might try to do that.

If you are using Event Modeling as a practice for exploring your domain, one approach is to look at all of the commands and events you captured in your modeling session and work out where the data overlaps, and which information needs to be locked at the same time so that correct changes are not corrupted by the plumbing.
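The "data overlap" idea can be sketched mechanically. The snippet below is only an illustration under assumptions: the command names and the data each one touches are made up, and in practice this grouping is a judgment call made at the whiteboard rather than the output of a script. It groups commands that share at least one piece of data, directly or through a chain of other commands.

```python
def cluster_commands(touches: dict[str, set[str]]) -> list[set[str]]:
    """Group commands that share data, directly or transitively."""
    clusters: list[tuple[set[str], set[str]]] = []  # (commands, data they touch)
    for command, data in touches.items():
        merged_commands, merged_data = {command}, set(data)
        remaining = []
        for commands, fields in clusters:
            if fields & merged_data:
                # Overlap: this cluster must change together with the new command.
                merged_commands |= commands
                merged_data |= fields
            else:
                remaining.append((commands, fields))
        remaining.append((merged_commands, merged_data))
        clusters = remaining
    return [commands for commands, _ in clusters]


# Hypothetical commands from a modeling session, with the data each one changes.
touches = {
    "TransferTitle": {"owner", "title_number"},
    "RecordLien": {"title_number", "lienholder"},
    "LogService": {"odometer", "service_entries"},
    "ScheduleService": {"service_entries"},
}

for group in cluster_commands(touches):
    print(sorted(group))
```

Each resulting group is a candidate aggregate: the data inside it needs to be consistent at the same moment, while separate groups never need to be locked together.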

my understanding of an aggregate is for example a car with its parts and the car is the Aggregate Root.

Careful.

In most cases that I have seen, real world things are not what we want to be talking about.

We're normally implementing an information system; we're not modeling cars, we're modeling information about cars: service logs, or title history, or accident claims. We're tracking information that changes over time, and that information may all be related to the same Vehicle Identification Number, but that doesn't necessarily mean that it is part of the same aggregate.

When we look at the commands and events from our storming session, and notice that none of the changes to title data care about service history, and none of the changes to service history care about the title, that's a hint that we may benefit from modeling "the car" using multiple aggregates with their own roots/boundaries/life cycles.
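To make that concrete, here is a small sketch of "the car" split into two aggregates that share nothing but a VIN. The class names, fields, and rules are assumptions chosen for illustration; a real model would take them from the domain experts.

```python
from dataclasses import dataclass, field


@dataclass
class TitleHistory:
    """Aggregate root for ownership data (hypothetical), keyed by VIN."""
    vin: str
    owners: list[str] = field(default_factory=list)

    def transfer_to(self, new_owner: str) -> None:
        if self.owners and self.owners[-1] == new_owner:
            raise ValueError("already the current owner")
        self.owners.append(new_owner)


@dataclass
class ServiceLog:
    """Aggregate root for maintenance data (hypothetical), keyed by the same VIN."""
    vin: str
    odometer_km: int = 0
    entries: list[tuple[int, str]] = field(default_factory=list)

    def record(self, odometer_km: int, description: str) -> None:
        if odometer_km < self.odometer_km:
            raise ValueError("odometer cannot go backwards")
        self.odometer_km = odometer_km
        self.entries.append((odometer_km, description))


vin = "VIN-EXAMPLE-0001"  # placeholder identifier, not a real VIN
title = TitleHistory(vin)
service = ServiceLog(vin)

title.transfer_to("Alice")
service.record(12_000, "oil change")
```

Neither class reads or writes the other's state, so each can be loaded, locked, and saved on its own; the VIN is just a shared identifier, not a shared boundary.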