Tags: microservices, publish-subscribe, google-cloud-pubsub

Microservices and PubSub: how to make sure a service consumes the right events


I have a microservices architecture for an app. Service S1 fetches data and stores it somewhere. Two other services, S2 and S3, run very different ML tasks. When S2 or S3 needs data, it publishes a message to PubSub on a topic called "fetch_request". S1 continuously pulls from this topic and fetches the requested data. When it finishes a task, it publishes a message on a topic called "fetch_done" to let the service that made the request know that the data has been fetched.

My question is: how do I make sure S2 and S3 don't consume messages from "fetch_done" that were not meant for them? I have thought of a few solutions, but I'm not sure about them:

  • Maybe when pulling from "fetch_done" I could add a filter to only pull messages that contain a UUID I wrote in the initial request message? This way, a service can only pull a message if it knows the ID. Of course, the service that fetches data would then need to put the ID in the response.

  • Kind of the same idea, but maybe just add the name of the requesting service to the initial request instead of an ID? The problem with this one is that a service could eventually impersonate another service, if I'm correct, and as I probably won't be the sole developer of every service in the app, I think the UUID is a better idea.

  • Something obvious I completely missed?


Solution

  • There are several patterns that you can implement. IMO it's better to have one subscription per service. That way, S1 posts a message to a topic and the message is duplicated into each subscription.

    Now, you can:

    • Put an attribute in the message sent by S2/S3 and received by S1. Based on this attribute, S1 posts to the topic for S2 or to the topic for S3. I don't really like this solution because it puts too much responsibility on the S1 side.
    • Put an attribute in the message sent by S2/S3 and received by S1. This time, S1 simply mirrors the attribute in the response message, which it posts to a single topic. When you create the S2 and S3 subscriptions, you add a filter to receive only the messages with the corresponding attribute. That way, for example, only the messages for S2 are delivered to the S2 subscription.
    • Put an attribute in the message sent by S2/S3 and received by S1. S1 mirrors this information in the body or in the attributes of the message it posts. There is no filter on the PubSub subscriptions: S2 and S3 receive all the messages and check on their side whether a message is for them (and continue processing) or not (and discard the message: ack it correctly but do nothing).
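
The attribute mirroring and the receiver-side check from the last option can be sketched in Python roughly as follows. This is only a minimal sketch: the attribute names `requester` and `request_id`, the helper names, and the `pending_ids` set are assumptions made for illustration, not something PubSub prescribes. The callback assumes a message object that exposes `attributes` and `ack()`, as the messages passed to a subscriber callback in the `google-cloud-pubsub` client do.

```python
import uuid
from typing import Any, Callable

REQUESTER_ATTR = "requester"    # hypothetical attribute name
REQUEST_ID_ATTR = "request_id"  # hypothetical attribute name


def new_request_attributes(service_name: str) -> dict[str, str]:
    """Attributes S2/S3 attach to a message published on fetch_request."""
    return {REQUESTER_ATTR: service_name, REQUEST_ID_ATTR: str(uuid.uuid4())}


def mirror_attributes(request_attrs: dict[str, str]) -> dict[str, str]:
    """Attributes S1 copies unchanged onto the matching fetch_done message."""
    return {key: request_attrs[key] for key in (REQUESTER_ATTR, REQUEST_ID_ATTR)}


def is_for_me(attrs: dict[str, str], service_name: str, pending_ids: set[str]) -> bool:
    """True if the response names this service and answers a pending request."""
    return (
        attrs.get(REQUESTER_ATTR) == service_name
        and attrs.get(REQUEST_ID_ATTR) in pending_ids
    )


def make_callback(
    service_name: str, pending_ids: set[str], handle: Callable[[Any], None]
) -> Callable[[Any], None]:
    """Build a subscriber callback that processes only this service's messages."""

    def callback(message: Any) -> None:
        if is_for_me(dict(message.attributes), service_name, pending_ids):
            handle(message)
        # Ack in both cases so messages meant for others are not redelivered here.
        message.ack()

    return callback
```

In this sketch the UUID matches a response to a pending request, while the `requester` attribute is what tells the services apart.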

    In any case, because the communication is async, you need an element that identifies the correct receiver of each message. The 2nd solution, with the PubSub filter, is the most efficient (I think).
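
For the filter-based option, creating a filtered subscription could look something like the sketch below. It assumes the `google-cloud-pubsub` Python client is installed and that S1 mirrors a `requester` attribute (a hypothetical name) onto each "fetch_done" message; the project, topic and subscription IDs are placeholders supplied by the caller. Note that the filter is set when the subscription is created and matches message attributes, not the message body, so it suits a stable value such as the service name rather than a per-request UUID.

```python
def requester_filter(service_name: str) -> str:
    """Build a subscription filter that matches one value of the requester attribute."""
    if '"' in service_name or "\\" in service_name:
        raise ValueError("service name must not contain quotes or backslashes")
    return f'attributes.requester = "{service_name}"'


def create_filtered_subscription(
    project_id: str, topic_id: str, subscription_id: str, service_name: str
):
    """Create a subscription that only receives messages for service_name."""
    # Imported lazily so the helper above works without the client library.
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    subscriber = pubsub_v1.SubscriberClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    subscription_path = subscriber.subscription_path(project_id, subscription_id)

    with subscriber:
        return subscriber.create_subscription(
            request={
                "name": subscription_path,
                "topic": topic_path,
                "filter": requester_filter(service_name),
            }
        )
```

S2 would then call something like `create_filtered_subscription("my-project", "fetch_done", "fetch_done-s2", "s2")` once, with placeholder IDs, and S1 would publish each response with `requester="s2"` as a message attribute.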