Tags: apache-karaf, failover, apache-servicemix, master-slave

Single install Apache Karaf with failover configuration using shared disk


I'm looking to implement failover (master/slave) for Karaf. Our current server setup has two application servers that have a shared SAN disk where our current Java applications are installed in a single location and can be started on either machine or both machines at the same time.

I was looking to implement Karaf master/slave failover in a similar way (one install shared by both app servers). However, I'm not sure this is a well-trodden path, and I would appreciate some advice on whether the alternatives (mentioned below) are significantly better.

Current idea for failover: Install Karaf once on the shared SAN and set up basic file locking on this shared disk. Both application servers will run the Karaf start script, but only one (the first) will fully start, grabbing the lock. The second will remain in standby until it can grab the lock (if the master falls over).
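
To make the mechanism concrete, here is a minimal Java sketch of the "first one grabs the lock, the other waits" behaviour described above. It is not Karaf's actual code; the class name `SharedDiskLockSketch`, the file name `karaf-sketch.lock`, and the one-second retry are assumptions made for illustration, and the only real API used is `java.nio.channels.FileChannel.tryLock()`.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of lock-file based master/standby election.
public class SharedDiskLockSketch {
    private final Path lockFile;
    private FileChannel channel;
    private FileLock lock;

    public SharedDiskLockSketch(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** One non-blocking attempt. Returns true if this instance holds the lock. */
    public boolean tryAcquire() {
        if (lock != null && lock.isValid()) {
            return true; // already the master
        }
        try {
            if (channel == null || !channel.isOpen()) {
                channel = FileChannel.open(lockFile,
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            }
            lock = channel.tryLock(); // null when another process holds the lock
            return lock != null;
        } catch (OverlappingFileLockException | IOException e) {
            // Held elsewhere in this JVM, or the shared disk is unavailable.
            return false;
        }
    }

    /** Gives the lock up so that a standby can take over. */
    public void release() {
        try {
            if (lock != null && lock.isValid()) {
                lock.release();
            }
            if (channel != null) {
                channel.close();
            }
        } catch (IOException e) {
            // Ignored in this sketch; real code would log the failure.
        } finally {
            lock = null;
            channel = null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Path lockFile = Paths.get(args.length > 0 ? args[0] : "karaf-sketch.lock");
        long delayMillis = 1000; // plays the role of a retry delay between attempts
        SharedDiskLockSketch node = new SharedDiskLockSketch(lockFile);
        while (!node.tryAcquire()) {
            System.out.println("standby: lock is held elsewhere, retrying");
            Thread.sleep(delayMillis);
        }
        System.out.println("master: lock acquired");
        Thread.sleep(3000); // stand-in for the running container
        node.release();
    }
}
```

Starting this program twice against the same file shows the second process waiting until the first exits. Whether the same holds across two hosts depends on the shared filesystem honouring file locks between machines, which is worth testing on the actual SAN before relying on it.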

The main benefit I can see is that there is only one Karaf installation to manage and to deploy components to.

Alternatives: We install Karaf in two separate locations on the shared SAN and set both up to lock on the same lock file. Each application server will have its own Karaf instance, and therefore its own start script to run.

This will make our deployment slightly more complicated (2 Karaf installations to manage and deploy to).

I'd be interested if anyone can indicate any specific concerns that they have with the current idea.

Note: I understand that Karaf Cellar can simplify my Karaf instance management. However, we would need to undertake another round of PoCs etc. to approve our company's use of Cellar (as a separate product). It is something that I'd like to migrate to in the future.


Solution

  • Take a look at the documentation

    This is from the documentation on how to set a lock file for HA (these properties go in etc/system.properties):

    karaf.lock=true
    karaf.lock.class=org.apache.karaf.main.lock.SimpleFileLock
    karaf.lock.dir=<PathToLockFileDirectory>
    karaf.lock.delay=10000
    

    As can be seen there, you can also set a lock level, which uses bundle start levels to decide which bundles start on the standby instance and which wait until the lock has been acquired:

    karaf.lock.level=50
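
As a rough illustration of what a lock level of 50 means for individual bundles, the Java sketch below applies the standard OSGi rule that a framework running at start level N starts the bundles whose start level is N or lower. It assumes that a standby instance runs at the lock level while it waits and that the master runs at a full start level of 100; the bundle names and their start levels are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of start-level gating; not part of Karaf.
public class LockLevelSketch {

    /** OSGi rule: a bundle is started when its start level is <= the framework's. */
    public static boolean isStarted(int bundleStartLevel, int frameworkStartLevel) {
        return bundleStartLevel <= frameworkStartLevel;
    }

    public static void main(String[] args) {
        int lockLevel = 50;  // karaf.lock.level from the answer above
        int fullLevel = 100; // assumed framework start level once the lock is held

        // Invented bundles and start levels, for illustration only.
        Map<String, Integer> bundles = new LinkedHashMap<>();
        bundles.put("logging", 8);
        bundles.put("shell", 30);
        bundles.put("my-application", 80);

        for (Map.Entry<String, Integer> e : bundles.entrySet()) {
            System.out.printf("%-15s level=%-3d standby=%-5b master=%b%n",
                    e.getKey(), e.getValue(),
                    isStarted(e.getValue(), lockLevel),
                    isStarted(e.getValue(), fullLevel));
        }
    }
}
```

Under these assumptions, application bundles placed above the lock level stay stopped on the standby, while lower-level infrastructure bundles are already running there, which shortens the takeover when the master fails.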