resolvejs

Performance issues with large datasets


Is there any way of filtering the events in a projection associated with a read model by the aggregateId?

In the tests carried out we always receive all registered events. Is it possible to apply filters in a previous stage?

We have 100,000 aggregateIds, and each id has 15,000 associated events. Since we are unable to filter by aggregateId, our projections have to iterate over all events.


Solution

  • So you have 100,000 aggregates with 15,000 events each.

    You can use ReadModel or ViewModel:

    Read Model:

    A read model can be seen as a read database for your app. So if you want to store some data about each aggregate, you should insert or update a row or entry in some table for each aggregate; see the Hacker News example read model code.
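    A minimal sketch of what such a read model projection could look like (the event names, table name, and state shape are assumptions for illustration, not taken from the question; the store API shown follows the reSolve read model store interface, and a tiny in-memory stand-in for the store is included just so the handlers can be exercised):

```javascript
// Hypothetical read model projection: each handler touches only the row
// for its own aggregateId, so queries read prepared rows instead of
// replaying all events. Event and table names are assumed.
const projection = {
  Init: async (store) => {
    await store.defineTable('Aggregates', {
      indexes: { id: 'string' },
      fields: ['eventCount']
    })
  },
  ITEM_CREATED: async (store, event) => {
    await store.insert('Aggregates', { id: event.aggregateId, eventCount: 1 })
  },
  ITEM_UPDATED: async (store, event) => {
    await store.update(
      'Aggregates',
      { id: event.aggregateId },
      { $inc: { eventCount: 1 } }
    )
  }
}

// Tiny in-memory stand-in for the reSolve store, only to demonstrate
// the handlers above without a running app:
function createMockStore() {
  const tables = {}
  return {
    tables,
    async defineTable(name) { tables[name] = new Map() },
    async insert(name, row) { tables[name].set(row.id, { ...row }) },
    async update(name, cond, mod) {
      const row = tables[name].get(cond.id)
      for (const [key, delta] of Object.entries(mod.$inc || {})) {
        row[key] += delta
      }
    }
  }
}

async function demo() {
  const store = createMockStore()
  await projection.Init(store)
  await projection.ITEM_CREATED(store, { aggregateId: 'a1' })
  await projection.ITEM_UPDATED(store, { aggregateId: 'a1' })
  await projection.ITEM_UPDATED(store, { aggregateId: 'a1' })
  return store.tables['Aggregates'].get('a1').eventCount
}
```

    Once built, a query against such a read model is a plain table lookup by aggregateId rather than an iteration over all events.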

    It is important to understand that reSolve read models are built on demand, on the first query. If you have a lot of events, this may take some time.

    Another thing to consider: a newly created reSolve app is configured to use an in-memory database for read models, so they are rebuilt on each app start.

    If you have a lot of events and don't want to wait for read models to rebuild each time you start the app, you have to configure real database storage for your read models.

    Configuring adapters is not well documented yet; we'll fix this. Here is what you need to write in the relevant config file for MongoDB:

    readModelAdapters: [
      {
        name: 'default',
        module: 'resolve-readmodel-mongo',
        options: {
          url: 'mongodb://127.0.0.1:27017/MyDatabaseName',
        }
      }
    ]
    

    Since you have a database engine, you can use it for an event store too:

    storageAdapter: {
      module: 'resolve-storage-mongo',
      options: {
        url: 'mongodb://127.0.0.1:27017/MyDatabaseName',
        collectionName: 'Events'
      }
    }
    

    View Model:

    A view model is built on the fly during the query. It does not require storage, but it reads all events for the given aggregateId.
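    A view model projection is a set of pure reducer-style functions that fold the events of one aggregateId into a state object. A minimal sketch (event names and state shape are assumptions for illustration, not from the question):

```javascript
// Hypothetical view model projection: pure (state, event) => state
// functions, folded over the events of a single aggregateId.
const viewModelProjection = {
  Init: () => ({ itemCount: 0 }),
  ITEM_CREATED: (state, event) => ({ ...state, itemCount: 1 }),
  ITEM_UPDATED: (state, event) => ({ ...state, itemCount: state.itemCount + 1 })
}

// Applying the projection manually, the way reSolve conceptually does
// when it replays the events for one aggregateId on a query:
const events = [
  { type: 'ITEM_CREATED', aggregateId: 'a1' },
  { type: 'ITEM_UPDATED', aggregateId: 'a1' },
  { type: 'ITEM_UPDATED', aggregateId: 'a1' }
]

const finalState = events.reduce(
  (state, event) => viewModelProjection[event.type](state, event),
  viewModelProjection.Init()
)
```

    With 15,000 events per aggregate this fold is expensive on the first query, which is exactly what snapshots (below) address.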

    reSolve view models use snapshots. So if you have 15,000 events for a given aggregate, then on the first request all of those events are applied to calculate the view model state for the first time. After this, the state is saved, and all subsequent requests read a snapshot plus any later events. By default, a snapshot is taken every 100 events. So on the second query, reSolve reads a snapshot for this view model and applies no more than 100 events to it.

    Again, keep in mind that if you want the snapshot storage to be persistent, you should configure a snapshot adapter:

    snapshotAdapter: {
      module: 'resolve-snapshot-lite',
      options: {
        pathToFile: 'path/to/file',
        bucketSize: 100
      }
    }
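    For orientation, the three adapter fragments above would sit side by side in a single reSolve config object (the exact config file name depends on your app template, e.g. config.prod.js, which is an assumption, not from the original):

```javascript
// Sketch: the adapter fragments combined in one config object.
// Values are copied from the fragments above.
export default {
  readModelAdapters: [
    {
      name: 'default',
      module: 'resolve-readmodel-mongo',
      options: {
        url: 'mongodb://127.0.0.1:27017/MyDatabaseName'
      }
    }
  ],
  storageAdapter: {
    module: 'resolve-storage-mongo',
    options: {
      url: 'mongodb://127.0.0.1:27017/MyDatabaseName',
      collectionName: 'Events'
    }
  },
  snapshotAdapter: {
    module: 'resolve-snapshot-lite',
    options: {
      pathToFile: 'path/to/file',
      bucketSize: 100
    }
  }
}
```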
    

    A view model has one more benefit: if you use the resolve-redux middleware on the client, the view model is kept up to date there, reactively applying the events that the app receives via WebSockets.