node.js, architecture, microservices

What's the right way to gather data from different microservices?


I'm having trouble understanding how basic communication between microservices should work, and I haven't found a good solution or a standard way to do this in other questions. Let's use this basic example.

[image]

I have an invoice service that returns invoices. Every invoice contains information (IDs) about the user and the products. If I have a view that needs to render the invoices for a specific user, I just make a simple request.

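const request = require("request"); // callback-style HTTP client (assumed from the call shape)

const url = "http://my-domain.com/api/v2/invoices";
const params = { userId: 1 };
request({ url, qs: params, json: true }, (err, res, body) => {
  if (err) throw err;
  const results = body; // An array of 1000 invoices for user 1
});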

Now, for this specific view I will need to make another request to get all the details for each product on each invoice.

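results.forEach((invoice) => {
  invoice.items.forEach((itemId) => {
    const url = `http://my-domain.com/api/v2/products/${itemId}`;
    // One HTTP request per item on every invoice
    request({ url, json: true }, (err, res, body) => {
      if (err) throw err;
      const product = body;
      // Do something else...
    });
  });
});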

I know the code example is not perfect, but you can see that this will generate a huge number of requests (at least 1000) to the product service for just one user. Now imagine 1000 users making this kind of request.

What is the right way to get the information for all the products without making this many requests, so as to avoid performance issues?

I found some workarounds for this kind of scenario, such as:

  1. Create an API endpoint that accepts a list of IDs in order to make a single request.
  2. Duplicate the information from the Product service within the invoice service and find a way to keep them in sync.
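
As a minimal sketch of the first workaround: collect the distinct product IDs across all invoices and turn them into a handful of batch requests instead of one request per item. The `?ids=` query parameter, the chunk size, and the helper names `collectProductIds` and `buildBatchUrls` are assumptions for illustration; the Product service would have to expose such an endpoint.

```javascript
// Hypothetical batch endpoint: GET /api/v2/products?ids=1,2,3
const PRODUCTS_URL = "http://my-domain.com/api/v2/products";

// Collect each product ID once, no matter how many invoices reference it.
function collectProductIds(invoices) {
  const ids = new Set();
  for (const invoice of invoices) {
    for (const itemId of invoice.items) ids.add(itemId);
  }
  return [...ids];
}

// Split the IDs into chunks so that no single URL grows too long.
function buildBatchUrls(ids, chunkSize = 100) {
  const urls = [];
  for (let i = 0; i < ids.length; i += chunkSize) {
    const chunk = ids.slice(i, i + chunkSize);
    urls.push(`${PRODUCTS_URL}?ids=${chunk.map(encodeURIComponent).join(",")}`);
  }
  return urls;
}
```

With 1000 invoices that share a few hundred distinct products, this would likely reduce the calls to the Product service from thousands to a few, at the cost of a new endpoint to maintain.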

In a microservices architecture, are these the right ways to deal with this kind of issue? To me, they look like simple workarounds.

Edit #1: Based on Remus Rusanu's response.

As per Remus's recommendation, I decided to isolate my services and describe them a little better.

[image]

As shown in the image above, the microservices are now isolated (specifically the Billing-service) and they now own their data. With this structure, I ensure that the Billing-service is able to work even if there are async jobs or even if the other two services are down.

If I need to create a new invoice, I can call the other two microservices (Users, Inventory) synchronously and then update the data in the "cache" tables (Users, Inventory) in my billing service.

Is it also reasonable to assume these "cache" tables are read-only? I assume they are, since only the Users/Inventory services should be able to modify this information, to preserve isolation and authority over it.
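
To make the read-only idea concrete, here is a minimal sketch that uses an in-memory `Map` as a stand-in for one "cache" table. The class name `ProductCache` and its methods are hypothetical; the point is only that writes happen in a single place, fed by data coming from the owning service, while billing code gets frozen copies.

```javascript
// In-memory stand-in for the Billing service's "cache" table of products.
class ProductCache {
  #rows = new Map();

  // Called only with data received from the Inventory service (the owner).
  applyOwnerUpdate(product) {
    this.#rows.set(product.id, Object.freeze({ ...product }));
  }

  // Billing code reads frozen copies, so it cannot modify the cached row.
  get(id) {
    return this.#rows.get(id);
  }
}
```

`Object.freeze` is shallow, so nested objects would need extra care; in a real database the equivalent would probably be restricting write permissions on those tables to the sync path.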


Solution

  • You need to isolate the services so that they do not share state/data. The design in your question is a single macroservice split into 3 correlated storage silos. Case in point: you cannot interpret a result from the 'Invoicing' service without correlating the data with the 'Products' response(s).

    Isolated microservices own their data and can operate independently. An invoice is complete as returned from the 'Invoices' service. It contains the product names, the customer name, and every other piece of information on the invoice. All the data comes from its own storage. A separate microservice could be 'Inventory', which manages all the product inventories, current stock, etc. It would also have its own data, in its own storage. A 'product' can exist in both stores, and there once was a logical link between them (when the invoice was created), but that link is severed now. The 'Inventory' microservice can change its products (e.g. remove one, add new SKUs, etc.) without affecting the existing invoices (this is not only a microservice isolation requirement, it is also a basic accounting requirement). I'm not going to go into the details here of what a product 'identity' is in real life.

    If you find yourself asking questions like the ones you're asking, it likely means you do not have microservices. You should think about your microservice boundaries while considering what happens if you replace all communication with async queue-based requests (a response can come 6 days later). If the semantics break, the boundary is probably wrong. If the semantics hold, you are on the right track.
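
The severed link described above can be sketched in a few lines. This is only an illustration under assumed names (`createInvoice`, `sku`, `unitPrice`, and the sample data are all hypothetical): the invoice copies the fields it needs when it is created, so later changes on the Inventory side leave it untouched.

```javascript
// Hypothetical helper: build an invoice that carries its own copy of the data.
function createInvoice(id, customer, products) {
  return {
    id,
    customerName: customer.name,
    // Copy the fields the invoice needs; keep no live reference to Inventory data.
    lines: products.map((p) => ({ sku: p.sku, name: p.name, unitPrice: p.unitPrice })),
  };
}

const inventory = [{ sku: "A-1", name: "Widget", unitPrice: 10 }];
const invoice = createInvoice(1, { name: "Alice" }, inventory);

// Inventory later renames the product and then removes it; the invoice is unchanged.
inventory[0].name = "Widget v2";
inventory.pop();
console.log(invoice.lines[0].name); // "Widget"
```

Rendering the invoice view then needs a single read from the invoice store and no call to the product service.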