I'm looking for a way to add new members to an existing Aeron cluster without reconfiguring the existing ones.
It seems cluster members are defined statically at startup, as described in the Cluster Tutorial:
final ConsensusModule.Context consensusModuleContext = new ConsensusModule.Context()
    .errorHandler(errorHandler("Consensus Module"))
    .clusterMemberId(nodeId)
    .clusterMembers(clusterMembers(Arrays.asList(hostnames))) // <------ HERE
    .clusterDir(new File(baseDir, "consensus-module"))
    .ingressChannel("aeron:udp?term-length=64k")
    .logChannel(logControlChannel(nodeId, hostname, LOG_CONTROL_PORT_OFFSET))
    .replicationChannel(logReplicationChannel(hostname))
    .archiveContext(aeronArchiveContext.clone());
If I understand this correctly, to add more nodes I would need to reconfigure each existing node to include the new member.
Moreover, I found this in the Aeron Cookbook (emphasis mine):
Key aspects of Raft:
- there is a Strong Leader, which means that all log entries flow from the leader to followers
- Raft makes use of randomized timers to elect leaders. This adds a few milliseconds to failover, but reduces the time to agree an elected leader (in Aeron Cluster, this is a maximum of the election timeout * 2).
- the Raft protocol allows runtime configuration changes (i.e. adding new or removing nodes at runtime). At the time of writing, this feature is still pending in Aeron Cluster.
However, I do see classes like io.aeron.cluster.DynamicJoin and its usage in io.aeron.cluster.ConsensusModuleAgent, which makes me think that adding nodes dynamically is possible and that perhaps the cookbook is outdated.
Do you know a way to join more nodes without touching existing ones?
Yes, it is possible! The context should be built like this:
ConsensusModule.Context()
    .errorHandler(errorHandler("Consensus Module"))
    .clusterMemberId(Aeron.NULL_VALUE) // <1>
    .clusterMembers("") // <2>
    .memberEndpoints(memberEndpoints(hostnames[nodeId], nodeId)) // <3>
    .clusterConsensusEndpoints(consensusEndpoints(hostnames)) // <4>
    .clusterDir(File(baseDir, "consensus-module"))
    .ingressChannel("aeron:udp?term-length=64k")
    .logChannel("aeron:udp?term-length=64k")
    .replicationChannel(logReplicationChannel(hostname))
    .archiveContext(aeronArchiveContext.clone())
<1> clusterMemberId must be set to Aeron.NULL_VALUE. The member ID will be generated automatically.
<2> clusterMembers should be empty. Static members are not required for a dynamic node.
<3> memberEndpoints is the channel configuration of this node. The format is ingress:port,consensus:port,log:port,catchup:port,archive:port. It is very similar to the static clusterMembers configuration for a single node, but without the member ID in front.
<4> clusterConsensusEndpoints is the comma-separated list of consensus:port endpoints of known cluster members. I think of it as similar to a "bootstrap" list of hosts to join.
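The memberEndpoints(...) and consensusEndpoints(...) helpers used above are my own and are not part of Aeron. Here is a minimal sketch of how they might build the two strings, in plain Java with no Aeron dependency. The port layout (PORT_BASE, PORTS_PER_NODE and the per-endpoint offsets) is an assumption made up for illustration, as is the idea that a node's position in the hostname list determines its ports; adjust both to whatever your existing nodes actually use.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not an Aeron class: builds the endpoint strings
// passed to memberEndpoints(...) and clusterConsensusEndpoints(...).
final class DynamicJoinEndpoints {
    // Assumed port layout; these values are illustrative only.
    static final int PORT_BASE = 9000;
    static final int PORTS_PER_NODE = 100;
    static final int ARCHIVE_PORT_OFFSET = 1;
    static final int INGRESS_PORT_OFFSET = 2;
    static final int CONSENSUS_PORT_OFFSET = 3;
    static final int LOG_PORT_OFFSET = 4;
    static final int CATCHUP_PORT_OFFSET = 5;

    // Each node gets its own block of ports, offset by endpoint type.
    static int port(final int nodeId, final int offset) {
        return PORT_BASE + (nodeId * PORTS_PER_NODE) + offset;
    }

    // Endpoints of this node, in the order ingress,consensus,log,catchup,archive
    // and without a leading member ID.
    static String memberEndpoints(final String hostname, final int nodeId) {
        return String.join(",",
            hostname + ":" + port(nodeId, INGRESS_PORT_OFFSET),
            hostname + ":" + port(nodeId, CONSENSUS_PORT_OFFSET),
            hostname + ":" + port(nodeId, LOG_PORT_OFFSET),
            hostname + ":" + port(nodeId, CATCHUP_PORT_OFFSET),
            hostname + ":" + port(nodeId, ARCHIVE_PORT_OFFSET));
    }

    // Comma-separated consensus endpoints of the known members.
    // Assumes the index in the list is the node index used for port allocation.
    static String consensusEndpoints(final List<String> hostnames) {
        final StringJoiner joiner = new StringJoiner(",");
        for (int i = 0; i < hostnames.size(); i++) {
            joiner.add(hostnames.get(i) + ":" + port(i, CONSENSUS_PORT_OFFSET));
        }
        return joiner.toString();
    }

    public static void main(final String[] args) {
        // A fourth node (index 3) joining a three-node cluster.
        System.out.println(memberEndpoints("node3", 3));
        System.out.println(consensusEndpoints(List.of("node0", "node1", "node2")));
    }
}
```

With this assumed layout, the joining node would advertise node3:9302,node3:9303,node3:9304,node3:9305,node3:9301 and would contact node0:9003,node1:9103,node2:9203 to join.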