Tags: microservices, event-sourcing, eventstoredb

Subscribing to an event stream from multiple nodes


We are evaluating using events as a source to build up reports and have spiked a number of different options.

We currently run our system in a Service Fabric cluster (with the intent to move to Kubernetes at some point), which means the subscriber to the various event streams would by default live on multiple nodes. We have looked at several event streaming implementations (Kafka, SqlStreamStore, and EventStoreDB) and have run into a common issue: the multi-node subscribers all try to handle new messages and build up a shared projection in parallel, meaning we would need to rely on a check against a previously-handled-messages table or on primary key constraints.

Possible solutions are sticking to a single-node subscriber, though I can't see this scaling as the events pile up, or building projections on the fly straight from the event stream. Has anyone encountered this problem or found a solution to it?


Solution

  • To run parallel projections, I'd recommend thinking about the partitioning/sharding strategy. The most obvious approach is to make sure that each projection type is handled by a specific subscriber (e.g. by subscribing only to the events handled by that projection type, or per stream type). That way, you can distribute the load without ending up with the competing consumers problem.
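As a minimal sketch of that routing idea (the stream-id convention and handler names are illustrative assumptions, not part of any specific library):

```python
# Sketch: route each event to the one handler that owns its stream category,
# so every projection type has exactly one subscriber/writer.

def stream_category(stream_id: str) -> str:
    # Assumed convention: stream ids look like "order-123"; the prefix
    # before the first dash is the category.
    return stream_id.split("-", 1)[0]

# One handler (and therefore one writer) per projection type.
handlers = {
    "order": lambda event: f"order projection applied {event['type']}",
    "invoice": lambda event: f"invoice projection applied {event['type']}",
}

def dispatch(event: dict) -> str:
    # Each category is owned by a single subscriber, so consumers never
    # compete for the same target read model.
    return handlers[stream_category(event["stream_id"])](event)
```

The key property is that two subscribers never write to the same read model; how you enforce the "one owner per category" rule (consumer groups, persistent subscriptions, partition keys) depends on the messaging technology.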

    If your consumers compete for a shared resource - e.g. the read model you're writing to - then it's hard to guarantee processing order. In theory, you could store the stream version in the projection and verify that the incoming event's version is higher. However, you might end up in a scenario where:

    • first handler - gets event 1 and 2
    • second handler - gets event 3

    If the first handler is lagging, the second will write the projection and set the version to 3. If you then perform a check based on version, events 1 and 2 will be ignored.
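The lost-update race above can be reproduced in a few lines (a deliberately naive sketch; the guard and data shapes are illustrative):

```python
# Sketch of the race described above: a naive "write only if the incoming
# version is higher" check silently drops events 1 and 2 once the second
# handler (holding event 3) has won the race.

projection = {"version": 0, "applied": []}

def apply_if_newer(event_version: int) -> bool:
    # Naive guard: accept only events with a higher version than stored.
    if event_version > projection["version"]:
        projection["version"] = event_version
        projection["applied"].append(event_version)
        return True
    return False

# The second handler writes event 3 first (the first handler is lagging)...
apply_if_newer(3)
# ...then the first handler finally delivers events 1 and 2: both rejected.
lost = [v for v in (1, 2) if not apply_if_newer(v)]
```

After this runs, `lost` contains events 1 and 2, and the projection only ever saw event 3 - exactly the silent data loss the answer warns about.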

    Also, with competing consumers, events can get out of order (e.g. because of retries), so you may end up with gaps in the version numbers and no easy way to verify whether it's okay to write.

    This is hard for projections from a single stream; for projections built from multiple streams, it gets even trickier. You might end up needing to maintain a separate version for each stream within the same read model, which is not manageable in the long term.

    This can work if your events are typical transport events - so "upserts." However, if you have business events, then losing an event may be critical.

    The other scenario is when you don't care about ordering and can live with the data being in a potentially wrong state for some time (e.g. first apply what's possible from the "update" event, then fill in the rest from the "create" event). However, that usually requires additional effort in the projection's business logic to ensure that out-of-order cases are handled properly.
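A sketch of that out-of-order-tolerant style, assuming hypothetical "Created"/"Updated" event shapes:

```python
# Sketch of an out-of-order-tolerant projection: apply whatever the "update"
# event carries first, then fill in the remaining fields when the "create"
# event arrives late.

read_model: dict = {}

def project(event: dict) -> None:
    row = read_model.setdefault(event["id"], {})
    if event["type"] == "Created":
        # Fill only the fields a late-arriving update has not already set,
        # so the newer data is never overwritten by older data.
        for key, value in event["data"].items():
            row.setdefault(key, value)
    elif event["type"] == "Updated":
        # Upsert: overwrite with the newer data.
        row.update(event["data"])

# Update arrives before create:
project({"id": "42", "type": "Updated", "data": {"status": "shipped"}})
project({"id": "42", "type": "Created", "data": {"status": "new", "name": "Widget"}})
```

Here the row keeps the status from the update ("shipped") and merely gains the missing fields from the create event - the extra business-logic effort the answer mentions is deciding, per field, which event wins.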

    I recommend starting with a single writer. If you see performance issues, define your partitioning/sharding strategy and ensure that you don't have competing subscribers for the same target read model. It's also essential to make your projections idempotent (so if you get the same event twice, it will have a single effect).
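The single-writer-plus-idempotency recommendation can be sketched like this (checkpoint storage and event shapes are illustrative assumptions; in production the checkpoint would be persisted transactionally with the read model):

```python
# Sketch of an idempotent single-writer projection: the handler records the
# last processed position per stream, so a redelivered event has no effect.

checkpoints: dict = {}   # stream id -> last applied position
counters: dict = {}      # the read model: stream id -> event count

def handle(stream_id: str, position: int) -> bool:
    # Skip anything at or below the checkpoint - redeliveries are no-ops,
    # which makes the projection idempotent.
    if position <= checkpoints.get(stream_id, -1):
        return False
    counters[stream_id] = counters.get(stream_id, 0) + 1
    checkpoints[stream_id] = position
    return True

handle("order-1", 0)
handle("order-1", 1)
handle("order-1", 1)  # duplicate delivery: ignored
```

Because there is a single writer per read model, the position check cannot race the way the competing-consumer version check does.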