design-patterns microservices scaling cqrs event-sourcing

Microservice CQRS: separating building (writing) the query model from reading it

The scenario is as follows:

Receiving 60,000 messages per minute from a queue. REST API serving the data from those messages 10 times per minute.

I have a microservices architecture with event-sourcing and CQRS. So my commands are already separated from the query part. The problem lies in syncing the query and querying it, not the command part.

Every few minutes about 60,000 commands are sent and stored as events using the event-sourcing pattern. Via CQRS the actual data (not the event) is synced to another service, which stores it in a database. Meanwhile, the data is read only a dozen times every few minutes.

In other words, this one service receives 60,000 write operations but only a dozen read operations.

I would really like to adhere to the design patterns of microservices, i.e. one database per service, but for scaling reasons I don't see that as feasible in my scenario. Writing to the database needs to significantly scale more than reading the database.

I saw a similar question but the answer proposes to use CQRS, which I already have implemented. Somebody told me before to remove the event-sourcing but that still leaves me with 60,000 writes and 10 reads.

What should be my architecture for scaling the reads and writes independently? I'm thinking of creating two separate services but this would be against the one database per service pattern.

Solution

Assumption

As far as I understand, the problem is that your read model needs to reflect the state of the write model as soon as possible: since you have just 10 reads per minute, they need to reflect the real state in real time or close to real time.

The Problem

Based on this assumption, splitting this domain into 2 micro-services and using CQRS will not solve that problem. Why?

Because if you have a Write micro-service and a Read micro-service, and you update the Read micro-service with the events you publish from the Write micro-service, you will have a problem with latency. There will be some minimum latency of around 50-100 milliseconds (most of the time it will be higher) until your Read micro-service db is in sync with your Write micro-service db. This is normal, and it is something you need to take into account when working with distributed systems using queues.

Based on this, splitting this part of the domain into 2 micro-services is not the best approach (again, this is based on my assumption that you need to read the Read micro-service data almost in real time).
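To make the sync gap concrete, here is a minimal sketch in Python. The class names `WriteService` and `ReadService`, the in-memory `deque` standing in for the message queue, and the `sensor-1` item are all assumptions for illustration, not part of the asker's system. It only shows that a read issued between "event published" and "event consumed" sees stale data.

```python
from collections import deque


class WriteService:
    """Hypothetical write side: stores events and publishes them to a queue."""

    def __init__(self, queue):
        self.events = []    # stand-in for the event store
        self.queue = queue  # stand-in for the message broker

    def handle(self, item_id, value):
        event = {"item_id": item_id, "value": value}
        self.events.append(event)  # 1. persist the event
        self.queue.append(event)   # 2. publish it for other services


class ReadService:
    """Hypothetical read side: builds its own query model from queued events."""

    def __init__(self, queue):
        self.queue = queue
        self.view = {}  # stand-in for the read database

    def consume_one(self):
        # In a real system this runs some time after the publish;
        # that delay is the sync latency discussed above.
        event = self.queue.popleft()
        self.view[event["item_id"]] = event["value"]

    def get(self, item_id):
        return self.view.get(item_id)


queue = deque()
writer = WriteService(queue)
reader = ReadService(queue)

writer.handle("sensor-1", 42)
stale = reader.get("sensor-1")  # event is still sitting in the queue
reader.consume_one()
fresh = reader.get("sensor-1")  # read model has caught up
print(stale, fresh)
```

In a real deployment the consumer runs continuously, so the stale window is short, but it never shrinks to zero.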

Possible Solutions

What could you do? There are a couple of things you could do here:

Option 1 - Database replication and CQRS
- The database part: Again, if you need the data to be up to date, you could use something like read-only replication of your Write database. SQL Server provides this out of the box; I used it on one of my projects. SQL Server would replicate your data to another database at the database level (this would be much faster than the whole round trip of publishing to a queue and consuming the message from another micro-service). This database would be an exact copy of your Write database and it would be read-only. This way your 10 read operations would almost always be up to date, with very low latency. You can read about this feature in the SQL Server documentation.
- The CQRS part for the back-end: When it comes to CQRS, you would still continue using it with your 2 micro-services (Write and Read). The Write micro-service would use your primary SQL Server instance and your Read micro-service would use the read-only replica of the primary database. Keep in mind that this read-only replica is a separate database running on a separate machine, so you would still satisfy the rule "1 database per micro-service". Of course, this option is only possible if you are using Microsoft SQL Server.
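The split described in the replication option above could be sketched roughly like this. It is only an illustration under loud assumptions: two in-memory SQLite databases stand in for the SQL Server primary and its read-only replica, `sqlite3.Connection.backup` stands in for database-level replication, and the `readings` table and function names are hypothetical. With real replication the copy step happens continuously inside the database engine, not in application code.

```python
import sqlite3

# Two in-memory SQLite databases stand in for the primary and the replica.
primary = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")

primary.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
primary.commit()


def replicate():
    """Stand-in for database-level replication: copy the primary into the replica."""
    primary.backup(replica)


def write_reading(value):
    """Write micro-service: only ever touches the primary."""
    with primary:  # commits on success, rolls back on error
        primary.execute("INSERT INTO readings (value) VALUES (?)", (value,))


def latest_reading():
    """Read micro-service: only ever touches the replica."""
    row = replica.execute(
        "SELECT value FROM readings ORDER BY id DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None


write_reading(21.5)
write_reading(22.0)
replicate()  # done by the database engine itself in a real setup
result = latest_reading()
print(result)
```

The point of the sketch is the ownership rule: the write path never queries the replica, and the read path never writes to the primary.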
Option 2 - Using Event Sourcing to produce materialized views and server-side CQRS
- The database part: In this approach you would use materialized views, which would serve the reads, in the same database as your main (Write) micro-service database. For example, if you use PostgreSQL you could use Marten (https://jasperfx.github.io/marten/) for event sourcing and storing the events. It is for .NET, but I guess there are solutions for other languages as well. The good thing about Marten is that you can generate materialized views (called projection views) which are updated as soon as the aggregate/model changes. This means that if, for example, some Customer object is changed using event sourcing, you would publish an event which would be stored in the database (using Marten). After that, Marten will update your projection view (which would be a table in your db, like CustomersProjection) by applying just the last event. This is very performant, and your view would be up to date as soon as the event is published. This way you would make use of your already existing event sourcing implementation.
- The back-end CQRS part: The server/back-end side would be split into 2 micro-services, as in the previous approach. Unlike the other approach, here everything would be in one physical database. You would still continue using CQRS, but only at the server/back-end level. At the database level you would physically access the same database from both micro-services. There would be just a logical split, and using the same db for both has some drawbacks as well. Even though you would have everything in one database, you would access only the projection views from the Read micro-service and all the other tables from the Write micro-service. There are ways to enforce this at the code level so that specific tables cannot be accessed. For example, if you use an ORM with .NET or Java you could do this easily. For other technologies there are similar solutions.
- The problem with this is that you would use one database for both micro-services.
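The projection idea above can be sketched without any particular library. The following is not Marten's API; it is a minimal Python illustration using an in-memory SQLite database, with hypothetical table names (`events`, `customers_projection`) and event types (`CustomerRegistered`, `CustomerRenamed`). It shows the two properties the answer relies on: the projection is updated by applying only the latest event, in the same transaction that stores the event, and the read side queries only the projection table.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE events (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id TEXT NOT NULL,
        type TEXT NOT NULL,
        payload TEXT NOT NULL
    );
    CREATE TABLE customers_projection (
        customer_id TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );
""")


def append_event(customer_id, event_type, payload):
    """Write side: store the event and apply it to the projection atomically."""
    with db:  # both statements commit together or not at all
        db.execute(
            "INSERT INTO events (customer_id, type, payload) VALUES (?, ?, ?)",
            (customer_id, event_type, json.dumps(payload)),
        )
        # Apply just this one event to the projection; no replay of history.
        if event_type == "CustomerRegistered":
            db.execute(
                "INSERT INTO customers_projection (customer_id, name) VALUES (?, ?)",
                (customer_id, payload["name"]),
            )
        elif event_type == "CustomerRenamed":
            db.execute(
                "UPDATE customers_projection SET name = ? WHERE customer_id = ?",
                (payload["name"], customer_id),
            )


def get_customer_name(customer_id):
    """Read side: queries only the projection table, never the events table."""
    row = db.execute(
        "SELECT name FROM customers_projection WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None


append_event("c-1", "CustomerRegistered", {"name": "Alice"})
append_event("c-1", "CustomerRenamed", {"name": "Alice Smith"})
name = get_customer_name("c-1")
event_count = db.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(name, event_count)
```

Because the event and the projection row are written in one transaction, a reader never sees an event that the projection has not yet absorbed, which is what removes the queue latency of the two-database setup.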
Option 3 - Not using CQRS at all for this part of the domain
- If you have your application/domain split into micro-services, for some parts of the domain it makes sense to use CQRS to separate the reads and writes, not only from the database point of view but also from the server scaling point of view. From the database point of view, the good thing about using 2 databases is that you can even pick different technologies for your Write and Read database. From the server side, the benefit is that you can scale the read part independently from the write part and vice versa. In your case, having only 10 reads per minute means that you don't have a big load on your data, so a separate database and micro-service is not necessary (I would even say it is overkill in this case). 10 more requests per minute on the same server, where the writes are handled together with the reads, makes almost no difference. But I don't know if this is only for now, or if the requirements are going to change and the read requests are going to increase. For now there is no need for it and I would put everything in one micro-service and database. In one of my previous projects we had exactly that scenario: we had a huge micro-service based architecture, and in some parts we were using CQRS, splitting the specific sub-domain into 2 micro-services with their own dedicated databases, but in other parts of the domain we did not use CQRS as it simply did not make any sense for us there.

Keep in mind that these proposals are in some cases specific to a database technology like SQL Server, PostgreSQL or similar, but the important thing for you is the idea and the approach. Most of these things can be done regardless of the database you use.

In general:

Is it a sin to break the 'one database per service' or is it okay to consider a service split in writing (to database) and reading/presenting (from same database) as one service.

I would say that, like for every rule, there are exceptions. If using CQRS with 2 databases makes your life hard and you have problems working with your system or domain because of it, then either you are not using the pattern/practices correctly or you are using the wrong pattern for your case. Keep in mind that patterns are there to solve common problems, and if they don't for a particular case, don't use them. With micro-services things get even more complicated, because a lot of things are adapted to fit your business needs. This is good, as the goal is to provide the best possible solution for the customer. If you and your team find that going with 2 micro-services and 1 database is the best solution, go for it. Don't make it a rule for your whole architecture, because it is not common practice in the micro-services world, but as always, if you have good arguments for it you can break the rule.

Writing to the database needs to significantly scale more than reading the database.

This is not a problem at all from the server/back-end point of view, as you can scale horizontally and have as many instances of this micro-service running as you need. For example, having 10 of them running at the same time just to serve the writes is fine. 10 additional reads per minute are nothing to worry about at this scale (in the case where everything, reads and writes, is in one micro-service). Scaling the database is another topic. Separating the reads into a dedicated database will not help with scaling the data that you have in the Write database. To solve that problem there are other things to consider, such as query optimization, adding proper indexes, data sharding, data historization and so on.
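A quick back-of-the-envelope calculation shows how small the read load is. The numbers come from the question (60,000 writes and 10 reads per minute) and from the example of 10 instances above; the even distribution of writes across instances is an assumption.

```python
WRITES_PER_MINUTE = 60_000
READS_PER_MINUTE = 10
INSTANCES = 10  # example instance count; assumes writes are spread evenly

writes_per_second = WRITES_PER_MINUTE / 60
writes_per_second_per_instance = writes_per_second / INSTANCES
read_share = READS_PER_MINUTE / (WRITES_PER_MINUTE + READS_PER_MINUTE)

print(f"{writes_per_second:.0f} writes/s in total")
print(f"{writes_per_second_per_instance:.0f} writes/s per instance")
print(f"reads are {read_share:.4%} of all operations")
```

With these figures the service handles 1000 writes per second in total, 100 per instance, and reads make up well under a tenth of a percent of all operations.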

Summary:

My suggestion would be to go with Option 1, but only if you use SQL Server. If your current database technology provides a comparable feature, you can implement it with that as well. If that does not work for you, I would suggest going with Option 3 and abandoning CQRS for this part of the domain completely. It seems to me that you don't need it for this case.