apache-zookeeper mesos mesosphere

Chronos Cluster with High Availability


I have three servers, A, B, and C. On each machine I'm running Chronos, ZooKeeper, mesos-master, and mesos-slave.

Chronos contacts mesos-master through the ZooKeeper URL, so it automatically picks the leading master even if some node is down. That part is highly available.

Chronos also runs in cluster mode, so whichever Chronos instance I access, I see the same list of jobs and everything works fine.

The problem is that Chronos is accessible at any of these three URLs:

  • http://server_node_1:4400
  • http://server_node_2:4400
  • http://server_node_3:4400

I have another application which schedules jobs in Chronos using the REST API. Which URL does my application have to talk to in order to run in high-availability mode?

Let's say my application talks to http://server_node_1:4400 to schedule a job. If Chronos on server_node_1 is down, I'm not able to schedule the job.

My application needs to talk to a single URL to schedule jobs in Chronos. Even if some Chronos node is down, I should still be able to schedule a job. Do I need some kind of load balancer between my application and the Chronos cluster to pick a running Chronos node for job scheduling? How can I achieve high availability in this scenario?


Solution

  • Use HAProxy for routing to a Chronos instance. This way you can access a Chronos instance using e.g. curl loadbalancer:8081.

    haproxy.cfg:

    listen chronos_8081
      bind 0.0.0.0:8081
      mode http
      balance roundrobin
      option allbackups
      option http-no-delay
      # "check" enables health checks, so a node that is down is taken out of rotation
      server chronos01 server_node_1:4400 check
      server chronos02 server_node_2:4400 check
      server chronos03 server_node_3:4400 check
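
If running a load balancer is not an option, the same failover idea can be sketched on the client side. The snippet below is only an illustration, not part of Chronos or HAProxy: `schedule_job` and `http_post_json` are hypothetical helper names, the node URLs are the ones from the question, and the `/scheduler/iso8601` path is an assumption that should be checked against the API of the Chronos version in use.

```python
import json
import urllib.error
import urllib.request

# Base URLs from the question.
CHRONOS_NODES = [
    "http://server_node_1:4400",
    "http://server_node_2:4400",
    "http://server_node_3:4400",
]


def http_post_json(url, payload, timeout=5):
    """POST payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def schedule_job(job, nodes=CHRONOS_NODES, path="/scheduler/iso8601",
                 post=http_post_json):
    """Try each node in order; return the base URL that accepted the job.

    The endpoint path is an assumption; adjust it to your Chronos version.
    """
    errors = []
    for base_url in nodes:
        try:
            post(base_url + path, job)
            return base_url
        except urllib.error.HTTPError:
            # The node answered with an HTTP error, so it is up;
            # trying another node would not fix a rejected request.
            raise
        except OSError as exc:
            # Connection refused, DNS failure, timeout: try the next node.
            errors.append(f"{base_url}: {exc}")
    raise RuntimeError("no Chronos node accepted the job: " + "; ".join(errors))
```

Compared with HAProxy, this keeps the node list inside the application, so every client has to be updated when the cluster changes. That is the main reason a single load-balanced URL is usually the simpler choice.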
    

    Or, even better, start Chronos via Marathon, which will ensure that the given number of instances is kept running. The HAProxy configuration could then be generated by: