Tags: scala, playframework, playframework-2.0, couchdb, jobs

Service with background jobs, how to ensure jobs only run periodically ONCE per cluster


I have a stateless service built on the Play Framework, intended to be deployed across many machines for horizontal scaling.

The service handles JSON requests and responses over HTTP and uses CouchDB as its data store, again for maximum scalability.

We have a small number of background jobs that need to run every X seconds across the whole cluster. It is vital that each job runs on only one machine at a time, rather than concurrently on every machine.

To execute the jobs we're using Actors and the Akka Scheduler (since we're using Scala):

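// Send "tick" to a LoggingJob actor immediately, then every 10 seconds.
Akka.system().scheduler.schedule(
    Duration.create(0, TimeUnit.MILLISECONDS),  // initial delay
    Duration.create(10, TimeUnit.SECONDS),      // interval between ticks
    Akka.system().actorOf(LoggingJob.props),    // receiving actor
    "tick")                                     // message sent on each tick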

(etc)

import akka.actor.{Actor, Props}

object LoggingJob {
    def props: Props = Props[LoggingJob]
}

// Logs every message it receives; the scheduler sends it "tick".
class LoggingJob extends Actor {
    def receive = {
        case message => Logger.info("Job executed! " + message)
    }
}

Is there:

  • any built-in trickery in Akka/Actors/Play that I've missed that will do this for me?
  • OR a recognised algorithm that I can put on top of CouchDB (distributed mutex? not quite?) to do this?

I do not want to make any of the instances 'special' as it needs to be very simple to deploy and manage.


Solution

  • Check out Akka's Cluster Singleton pattern. From the Akka documentation:

    For some use cases it is convenient and sometimes also mandatory to ensure that you have exactly one actor of a certain type running somewhere in the cluster.

    Some examples:

    • single point of responsibility for certain cluster-wide consistent decisions, or coordination of actions across the cluster system
    • single entry point to an external system
    • single master, many workers
    • centralized naming service, or routing logic

    Using a singleton should not be the first design choice. It has several drawbacks, such as single-point of bottleneck. Single-point of failure is also a relevant concern, but for some cases this feature takes care of that by making sure that another singleton instance will eventually be started.
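
Applied to the question's periodic job, the pattern could look roughly like the sketch below. It assumes Akka 2.4 or later with the `akka-cluster-tools` module on the classpath and an actor system that has already been configured to join a cluster (cluster actor provider, seed nodes). `JobRunner` and `ClusterJobs` are hypothetical names chosen for illustration. `scheduler.schedule` is used to match the question's code; newer Akka versions deprecate it in favour of `scheduleWithFixedDelay`.

```scala
import akka.actor.{Actor, ActorSystem, Cancellable, PoisonPill, Props}
import akka.cluster.singleton.{ClusterSingletonManager, ClusterSingletonManagerSettings}
import scala.concurrent.duration._

// Hypothetical actor that owns the periodic timer. Because it only exists
// on the node currently hosting the singleton, ticks come from one node.
class JobRunner extends Actor {
  import context.dispatcher // execution context required by the scheduler

  private var ticker: Option[Cancellable] = None

  // Start ticking as soon as the singleton is started on this node.
  override def preStart(): Unit =
    ticker = Some(
      context.system.scheduler.schedule(0.seconds, 10.seconds, self, "tick"))

  // Stop ticking when the singleton is stopped or handed over.
  override def postStop(): Unit = ticker.foreach(_.cancel())

  def receive: Receive = {
    case "tick" => println("Job executed!") // replace with the real job
  }
}

object ClusterJobs {
  // Call once on every node at startup; all nodes run identical code,
  // so no instance has to be deployed as a "special" one.
  def start(system: ActorSystem): Unit =
    system.actorOf(
      ClusterSingletonManager.props(
        singletonProps = Props(classOf[JobRunner]),
        terminationMessage = PoisonPill,
        settings = ClusterSingletonManagerSettings(system)),
      name = "jobRunner")
}
```

Every node starts a `ClusterSingletonManager`, but only one of them runs the `JobRunner` child at any time. If that node leaves the cluster, a manager on another node starts a new instance, so there can be a short gap in ticks during the handover.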