Tags: java, cluster-computing, replication, hazelcast

Ensure replication between data centres with Hazelcast


I have an application incorporating a stretched Hazelcast cluster deployed across 2 data centres simultaneously. The 2 data centres are usually both fully functional, but, at times, one of them is taken completely out of the network for SDN upgrades.

What I intend to achieve is to configure the cluster in such a way that each main partition from a DC has at least 2 backups - one in the other data centre and one in the current one.
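(For the replica-count side of this, assuming the data lives in an IMap - the map name "cache" below is only an example - the map itself would carry 2 synchronous backups; how those copies get placed is what the partition group configuration further down is meant to control.)

    // Keep 2 backup copies of every partition of this map, in addition to the owner.
    MapConfig mapConfig = new MapConfig("cache").setBackupCount(2);
    config.addMapConfig(mapConfig);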

For this purpose, the documentation pointed me in the direction of partition groups (http://docs.hazelcast.org/docs/2.3/manual/html/ch12s03.html). Enterprise WAN Replication seemed like exactly what we wanted, but, unfortunately, this feature is not available in the free version of Hazelcast.

My configuration is as follows:

    // Join configuration: multicast and TCP/IP discovery are toggled from our
    // own hzClusterConfigs properties object.
    NetworkConfig network = config.getNetworkConfig();

    network.setPort(hzClusterConfigs.getPort());
    JoinConfig join = network.getJoin();
    join.getMulticastConfig().setEnabled(hzClusterConfigs.isMulticastEnabled());
    join.getTcpIpConfig()
            .setMembers(hzClusterConfigs.getClusterMembers())
            .setEnabled(hzClusterConfigs.isTcpIpEnabled());
    config.setNetworkConfig(network);

    // Custom partition groups: one member group per data centre, matched by
    // interface pattern, so that backups are placed in the other group.
    PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig()
            .setEnabled(true).setGroupType(PartitionGroupConfig.MemberGroupType.CUSTOM)
            .addMemberGroupConfig(new MemberGroupConfig().addInterface(hzClusterConfigs.getClusterDc1Interface()))
            .addMemberGroupConfig(new MemberGroupConfig().addInterface(hzClusterConfigs.getClusterDc2Interface()));
    config.setPartitionGroupConfig(partitionGroupConfig);
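(The Config assembled this way is then used to start the embedded member; a minimal sketch of that step, which is not shown in the snippet above:)

    // Start an embedded Hazelcast member with the configuration built above.
    HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);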

The configs used initially were:

    clusterMembers=host1,host2,host3,host4
    clusterDc1Interface=10.10.1.*
    clusterDc2Interface=10.10.1.*

However, with this set of configs, any event that changed the composition of the cluster caused a random node to start logging "No member group is available to assign partition ownership" every other second (as here: https://github.com/hazelcast/hazelcast/issues/5666). What is more, checking the state exposed by the PartitionService over JMX revealed that no partitions were actually getting populated, despite the apparently successful cluster state.
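(For reference, a programmatic equivalent of that JMX check, assuming an embedded HazelcastInstance named hz as above, would be something along these lines:)

    // Count partitions that still have no owner; a healthy cluster should report 0.
    int unassigned = 0;
    for (Partition partition : hz.getPartitionService().getPartitions()) {
        if (partition.getOwner() == null) {
            unassigned++;
        }
    }
    System.out.println("Partitions without an owner: " + unassigned);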

As such, I proceeded to replace the hostnames with the corresponding IPs, and the configuration worked: the partitions were created successfully and no nodes were acting out.

The problem here is that the boxes are created as part of an A/B deployment process and get their IPs automatically from a range of 244 addresses. Adding all 244 IPs to the member list seems like a bit much, even if it were done programmatically from Chef rather than manually, because of all the network noise it would entail. Checking at every deployment, with a telnet-based client, which machines are listening on the Hazelcast port also seems problematic, since the IPs differ from one deployment to the next and we would end up in a situation where part of the nodes in the cluster have one member list while the rest have a different one at the same time.

Using hostnames would be the best solution, in my opinion, because we would rely on DNS resolution and wouldn't need to wrap our heads around IP resolution at provisioning time.
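To make that concrete, here is a rough sketch of the direction I have in mind (the member list keeps holding hostnames, and each node resolves them itself at startup; the exception handling shown here is only illustrative):

    // Resolve the configured hostnames to their current IPs once, at startup,
    // and hand the resulting addresses to the TCP/IP join configuration.
    List<String> memberIps = new ArrayList<>();
    try {
        for (String host : hzClusterConfigs.getClusterMembers()) {
            memberIps.add(InetAddress.getByName(host).getHostAddress());
        }
    } catch (UnknownHostException e) {
        throw new IllegalStateException("DNS resolution failed at startup", e);
    }

    config.getNetworkConfig().getJoin().getTcpIpConfig()
            .setMembers(memberIps)
            .setEnabled(true);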

Does anyone know of a workaround for the group config issue? Or, perhaps, an alternative to achieve the same behavior?


Solution

  • This is not possible at the moment. Backup groups cannot be designed in such a way that they hold a backup of themselves. As a workaround you might be able to define 4 groups, but in that case there is no guarantee that one backup will end up in each data centre, at least not without using 3 backups (see the sketch below).

    Anyhow, in general we do not recommend spreading a Hazelcast cluster over multiple data centres, except for the very specific situation where the DCs are interconnected in a way that is similar to a LAN network and redundancy is set up.
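    As a rough illustration of the 4-group workaround mentioned above (the interface patterns and the map name "cache" are made-up examples, and splitting each data centre into two groups is an assumption):

        // Four custom member groups, two per data centre. Replicas of a partition
        // are spread across different groups, so with 3 backups there are 4 copies
        // and each group - and therefore each data centre - should hold one.
        config.getPartitionGroupConfig()
                .setEnabled(true)
                .setGroupType(PartitionGroupConfig.MemberGroupType.CUSTOM)
                .addMemberGroupConfig(new MemberGroupConfig().addInterface("10.10.1.*")) // DC1, group A
                .addMemberGroupConfig(new MemberGroupConfig().addInterface("10.10.2.*")) // DC1, group B
                .addMemberGroupConfig(new MemberGroupConfig().addInterface("10.10.3.*")) // DC2, group A
                .addMemberGroupConfig(new MemberGroupConfig().addInterface("10.10.4.*")); // DC2, group B

        // Owner + 3 backups = 4 copies of every partition.
        config.getMapConfig("cache").setBackupCount(3);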