c# asp.net-core cqrs

CQRS pattern - need to read data when processing a command?


I'm practicing the CQRS pattern and there is one thing I can't understand. I need to execute a command that creates an entity, and that entity has navigation properties. To populate them, I load the related data from the database by ObjectId while creating the entity. So it turns out I'm running queries inside a command.

public async Task<ResponseBase> Handle(CommandCreateItem request, CancellationToken cancellationToken)
{
    var dto = request.Item;

    // Load the related entities referenced by the incoming ids.
    // FindAsync returns null when no row has that key, so validate before use.
    _color = await _context.Colors.FindAsync(dto.ColorId);
    _seasonItem = await _context.SeasonItems.FindAsync(dto.SeasonItemId);
    _itemType = await _context.ItemTypes.FindAsync(dto.ItemTypeId);

    var price = decimal.Parse(dto.Price, NumberStyles.Any, CultureInfo.InvariantCulture);
    var countItem = uint.Parse(dto.CountItem);

    // Create the item once, so the characteristic below references the
    // same instance that gets saved (not an empty placeholder Item).
    var newItem = new Item
    {
        Id = Guid.NewGuid(),
        Title = dto.Title,
        ArticleNumber = dto.ArticleNumber,
        Description = dto.Description,
        NumberOfSales = 0,
        CountItem = countItem,
        Price = price,
    };
    var characteristic = new CharacteristicItem
    {
        Id = Guid.NewGuid(),
        Color = _color,
        SeasonItem = _seasonItem,
        ItemType = _itemType,
        Item = newItem,
    };
    newItem.CharacteristicItem = characteristic;
    await _context.CharacteristicItems.AddAsync(characteristic, cancellationToken);
    await _context.Items.AddAsync(newItem, cancellationToken);
    await _context.SaveChangesAsync(cancellationToken);

    return new ResponseItemCreate(newItem.Id);
}

Is this normal? What is the right way to do it? After all, the essence of the pattern is to separate responsibilities.


Solution

  • TL;DR: there's nothing wrong with reading from a persisted data store / source of truth (e.g. a central database) when you execute Commands (C), e.g. transactions in a stateless design.

    It's the Query (Q) side of CQRS that requires you to find novel ways to serve queries and reads from alternate data sources, to free up capacity on the source of truth.

    E.g. in OP's example, if the foreign key data such as color, seasonItem and itemType doesn't change frequently, you could consider caching it in memory or in a distributed cache. Also, in Entity Framework, you should be able to set the relationships through the foreign key ids (instead of fetching the full tracked entities), which could avoid the need to fetch the FK objects altogether, since the ids are already available on the incoming request.
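
    The foreign key approach can be sketched as follows. This is only an illustration: the question does not show the entity classes, so the scalar ColorId, SeasonItemId and ItemTypeId properties, their Guid type, and the ItemFactory helper are all assumptions. By EF Core convention, a property named after the navigation plus "Id" is picked up as the foreign key for that navigation.

    ```csharp
    using System;

    // Hypothetical entity shapes (assumed; not shown in the question).
    public class Item
    {
        public Guid Id { get; set; }
        public string Title { get; set; } = "";
        public CharacteristicItem CharacteristicItem { get; set; }
    }

    public class CharacteristicItem
    {
        public Guid Id { get; set; }

        // Scalar foreign keys (assumed names and types). Setting these is
        // enough to persist the relationship; no related entity is loaded.
        public Guid ColorId { get; set; }
        public Guid SeasonItemId { get; set; }
        public Guid ItemTypeId { get; set; }

        public Item Item { get; set; }
    }

    // Hypothetical helper that builds the graph from ids alone.
    public static class ItemFactory
    {
        public static Item Create(string title, Guid colorId, Guid seasonItemId, Guid itemTypeId)
        {
            var item = new Item { Id = Guid.NewGuid(), Title = title };
            item.CharacteristicItem = new CharacteristicItem
            {
                Id = Guid.NewGuid(),
                ColorId = colorId,
                SeasonItemId = seasonItemId,
                ItemTypeId = itemTypeId,
                Item = item,
            };
            return item;
        }
    }
    ```

    In the handler, the resulting item would then be added to the context and saved as before. The trade-off is that an id which does not exist is no longer caught by a null lookup; if the database enforces the foreign key constraints, it would instead surface as an error at save time.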

    Detail

    Simply put, CQRS implies that reading data need not be done from the same data source, nor in the same manner that writing is done.

    By adhering to this principle, CQRS allows systems with heavy read activity (e.g. UI views, GET query APIs, etc.) to scale far beyond what an equivalent system that both reads from and writes to a centralised data store (source of truth), e.g. a single SQL RDBMS, would otherwise allow.

    Most scaled-out transactional (OLTP) systems are stateless and keep their state in a persisted data store. It follows that, when processing Commands, most systems will need to read the current state (truth) to assert any preconditions, rules or validations before applying any changes.

    So there is nothing wrong with your stateless, 'read before write' approach to processing Command transactions. In any event, the CAP theorem limits how far you can scale write transactions while still preserving the integrity of your data.

    Stateless transactional systems typically need to use pessimistic locking or optimistic concurrency patterns to ensure that any assumptions a transaction makes from the data it read are still valid (i.e. the data hasn't changed) between the 'read' and the 'write'. With pessimistic locking, you'll need to read and write from the true source. With optimistic concurrency, you can also base your Commands off CQRS read stores, but you'll need to track and assert that nothing has changed in the interim through a data version or timestamp.
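
    As a rough illustration of the version check used by optimistic concurrency, here is a minimal in-memory sketch. StockRow and InMemoryStockStore are hypothetical names invented for this example, and the dictionary merely stands in for the real database.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical row with a version number acting as the concurrency token.
    public sealed record StockRow(Guid ItemId, uint CountItem, int Version);

    // Hypothetical store; a real system would do the compare in the database.
    public sealed class InMemoryStockStore
    {
        private readonly Dictionary<Guid, StockRow> _rows = new();
        private readonly object _gate = new();

        public void Insert(StockRow row)
        {
            lock (_gate) { _rows[row.ItemId] = row; }
        }

        public StockRow Read(Guid id)
        {
            lock (_gate) { return _rows[id]; }
        }

        // Writes only if the stored version still equals the version the
        // caller read earlier; otherwise the write is rejected.
        public bool TryUpdate(StockRow updated, int expectedVersion)
        {
            lock (_gate)
            {
                if (!_rows.TryGetValue(updated.ItemId, out var current)
                    || current.Version != expectedVersion)
                {
                    return false; // changed (or removed) since it was read
                }

                _rows[updated.ItemId] = updated with { Version = expectedVersion + 1 };
                return true;
            }
        }
    }
    ```

    A caller whose TryUpdate returns false would typically re-read the row and retry or reject the Command. With EF Core, the same idea is usually expressed by marking a property as a concurrency token and handling DbUpdateConcurrencyException on save.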

    The main benefit of a separate Read store in CQRS is for non-transactional read activity, e.g. Point Reads (GetById) or Queries for UI, reports, analytics etc. which do not require absolutely fresh data. If reads can be done from a quicker 'cache' (often distributed, replicated, and updated asynchronously, hence eventually consistent), then the overall scale of the system improves. These read caches (often called read sides or read stores) allow read-heavy systems to scale beyond the limitations of a shared read-write database.

    When processing Commands which depend on the existing state of an entity / aggregate root, the only way to avoid the stateless read-before-write activity while still retaining consistency and integrity is to move the source of truth out of storage and into memory, e.g. with specific implementations of the Actor Model. However, moving to a stateful design brings different challenges, such as routing (for scale) and failover / fault tolerance.

    So for read-scalable, stateless CQRS systems, ensure your command writes are efficient and consistent, but spend most of your effort in finding novel ways to service read and query activity from caches (e.g. in-memory, distributed, precomputed projections, etc). Reactive or Event-Driven Architecture is one way to help achieve this result.
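
    A precomputed projection of this kind might look roughly like the sketch below. The ItemCreated event, the ItemListRow shape and the ItemListProjection class are hypothetical names made up for illustration, and an in-process dictionary stands in for whatever cache or read store is actually used.

    ```csharp
    using System;
    using System.Collections.Concurrent;

    // Hypothetical event published by the command side after an item is saved.
    public sealed record ItemCreated(Guid ItemId, string Title, decimal Price);

    // Hypothetical denormalized row, shaped for a list view.
    public sealed record ItemListRow(Guid ItemId, string Title, decimal Price);

    public sealed class ItemListProjection
    {
        private readonly ConcurrentDictionary<Guid, ItemListRow> _rows = new();

        // Invoked by whatever delivers events; delivery is usually
        // asynchronous, so reads can lag behind writes for a short time.
        public void Apply(ItemCreated e)
        {
            _rows[e.ItemId] = new ItemListRow(e.ItemId, e.Title, e.Price);
        }

        // Point read served from the projection, not from the write database.
        public bool TryGetById(Guid id, out ItemListRow row)
        {
            return _rows.TryGetValue(id, out row);
        }
    }
    ```

    Queries that can tolerate slightly stale data read from the projection, while Commands keep reading from and writing to the source of truth.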

    Alternatively, stateful Command processing architectures exist which avoid the read-before-write altogether, to further reduce latency, especially where there is a high frequency of messages correlated to relatively few entities / aggregate roots. Here the current source of truth is held in memory. As new Commands are processed, new state is derived and persisted to the truth store (e.g. a relational, document, or event-sourced db). In order to scale and be resilient, an upstream router and an entity management / assignment mechanism are required. Apache Kafka or gRPC streaming is frequently used for this. Serial processing of incoming Commands to the same aggregate root avoids any need for pessimistic locking.
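
    The serial, in-memory processing described above can be sketched with a single-consumer mailbox. This is a simplified illustration and not a full actor framework: StockActor and ReserveAsync are hypothetical names, and routing, persistence and failover are left out.

    ```csharp
    using System;
    using System.Threading.Channels;
    using System.Threading.Tasks;

    // Hypothetical in-memory aggregate holding the live stock count of one item.
    public sealed class StockActor
    {
        private readonly Channel<(uint Quantity, TaskCompletionSource<bool> Reply)> _mailbox =
            Channel.CreateUnbounded<(uint Quantity, TaskCompletionSource<bool> Reply)>();

        // Only the single consumer loop touches this field, so no lock is needed.
        private uint _countItem;

        public StockActor(uint initialCount)
        {
            _countItem = initialCount;
            _ = Task.Run(() => ProcessAsync());
        }

        // Enqueues a reserve Command; the task completes once it has been handled.
        public Task<bool> ReserveAsync(uint quantity)
        {
            var reply = new TaskCompletionSource<bool>(
                TaskCreationOptions.RunContinuationsAsynchronously);
            _mailbox.Writer.TryWrite((quantity, reply));
            return reply.Task;
        }

        // Handles Commands one at a time, in arrival order.
        private async Task ProcessAsync()
        {
            await foreach (var (quantity, reply) in _mailbox.Reader.ReadAllAsync())
            {
                if (quantity <= _countItem)
                {
                    _countItem -= quantity;
                    // A real system would persist the new state or an event here.
                    reply.SetResult(true);
                }
                else
                {
                    reply.SetResult(false); // not enough stock; state unchanged
                }
            }
        }
    }
    ```

    Because every Command for the same aggregate goes through one mailbox, the precondition check and the state change cannot interleave with another Command, which is what removes the need for a lock or a database read per Command.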