architecture, message-queue, microservices, distributed-system

Reducing unnecessary work for a multiple-instance service


I have a small service that connects to a third-party web service, obtains some information, and saves it into a Mongo collection. The data this service is interested in is fairly static, but it can change under exceptional circumstances (it is football schedules, by the way). To pick up changes, the service checks back every 3-6 hours to see whether any matches have been cancelled or rescheduled. New entries end up in the database; old ones are discarded (since they are already in the collection).

The service also exposes a GET endpoint, to which users connect.

Now, this is fine when I run a single instance of the service, but not so nice when I have multiple instances: it probably does not make sense for every instance to query the data service every three hours and discard most of the result.

I have the following ideas how to solve this:

  • Use some kind of leader election algorithm, so that only the leader queries the third-party service
  • Separate the service into two: a smaller service would query the data (still problematic with several instances) and put the result on a message queue, so it's guaranteed that only one consumer takes and processes that result
  • Combine the first two ideas: leader election for the querying service, message queues for consuming data
  • Use some kind of distributed lock (I am aware of a solution with Redis/Jedis) so only one instance does the querying. This, however, feels like a bit of overkill; adding Redis just for locking is like...meh...
  • A much better, other idea, commonly used in such cases :-)

Could you please let me know if there is a preferred solution to such problems?


Solution

  • I would keep things simple and avoid unnecessary complexity. Just persist the time of the last WS response, and have each instance check in the DB how much time has passed since the last call before calling the WS again.
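
The check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: `LastCallStore`, `try_claim`, `refresh_if_due` and `REFRESH_INTERVAL` are hypothetical names, and the store is an in-memory stand-in for the shared DB record. One deliberate variation from the wording above: the sketch records the timestamp before calling the WS rather than after the response, and does the check and the write in one atomic step, so two instances waking up at the same moment cannot both decide to call. With a real database the same effect would need a single conditional update rather than a separate read and write.

```python
from datetime import datetime, timedelta, timezone
from threading import Lock

# Assumption: the lower bound of the 3-6 hour polling window from the question.
REFRESH_INTERVAL = timedelta(hours=3)


class LastCallStore:
    """Hypothetical in-memory stand-in for the shared DB record
    that holds the time of the last WS call."""

    def __init__(self):
        self._lock = Lock()
        self._last_call = None

    def try_claim(self, now, interval):
        """Atomically record `now` as the last-call time if at least
        `interval` has passed since the previous one.
        Returns True if this caller should query the WS."""
        with self._lock:
            if self._last_call is not None and now - self._last_call < interval:
                return False
            self._last_call = now
            return True


def refresh_if_due(store, fetch, now=None):
    """Call `fetch` only if no instance has claimed a refresh recently.
    Returns True if `fetch` was called."""
    now = now or datetime.now(timezone.utc)
    if not store.try_claim(now, REFRESH_INTERVAL):
        return False  # another instance refreshed recently, skip the WS call
    fetch()
    return True
```

Each instance would run `refresh_if_due(store, query_third_party)` on its own timer, where `query_third_party` is whatever function performs the WS call and saves the result. A trade-off of claiming before the call: if the WS call fails, the next attempt waits a full interval unless the failure path resets the stored timestamp.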