I'm setting up an InnoDB Cluster using mysqlsh. This is in Kubernetes, but I think this question applies more generally.
When I use cluster.configureInstance() I see messages that include:
This instance reports its own address as node-2:3306
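For reference, the step looks roughly like this (I'm paraphrasing with the dba.configureInstance() form of the call; the account is a placeholder):

    // Rough sketch of configuring an instance before clustering it;
    // the clusteradmin account is a placeholder.
    dba.configureInstance('clusteradmin@node-2:3306');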
However, the nodes can only find each other through DNS at an address like node-2.cluster:3306. The problem comes when adding instances to the cluster; they try to find the other nodes without the qualified name. Errors are of the form:
[GCS] Error on opening a connection to peer node node-0:33061 when joining a group. My local port is: 33061.
It is using node-n:33061 rather than node-n.cluster:33061.
If it matters, the "DNS" is set up as a headless service in Kubernetes that provides consistent addresses as pods come and go. It's very simple, and I named it "cluster" to create addresses of the form node-n.cluster. I don't want to cloud this question with detail I don't think matters, however, as surely other configurations require the instances in the cluster to use DNS as well.
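For completeness, the Service is essentially the following sketch (the app: mysql selector label is a placeholder for whatever the pods actually carry):

    # Minimal sketch of the headless Service. clusterIP: None makes it
    # headless, so each pod gets a stable DNS name of the form
    # <pod-name>.cluster within the namespace.
    apiVersion: v1
    kind: Service
    metadata:
      name: cluster
    spec:
      clusterIP: None
      selector:
        app: mysql
      ports:
        - name: mysql
          port: 3306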
I thought that setting localAddress when creating the cluster and adding the nodes would solve the problem. Indeed, after I added that to the createCluster options, I can look in the database and see:
| group_replication_local_address | node-0.cluster:33061 |
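For reference, this is roughly what I ran (the account name is a placeholder; localAddress is a documented option of both createCluster and addInstance):

    // Rough sketch: creating the cluster and adding an instance with
    // explicit localAddress values. The clusteradmin account is a
    // placeholder for whatever admin account you use.
    shell.connect('clusteradmin@node-0:3306');
    var cluster = dba.createCluster('mycluster', {
        localAddress: 'node-0.cluster:33061'
    });
    // This is the step that fails with the GCS errors above:
    cluster.addInstance('clusteradmin@node-1:3306', {
        localAddress: 'node-1.cluster:33061'
    });
    // Checked afterwards from SQL with:
    //   SHOW VARIABLES LIKE 'group_replication_local_address';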
After I create the cluster and look at the topology, it seems that the local address setting has no effect whatsoever:
{
    "clusterName": "mycluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "node-0:3306",
        "ssl": "REQUIRED",
        "status": "OK_NO_TOLERANCE",
        "statusText": "Cluster is NOT tolerant to any failures.",
        "topology": {
            "node-0:3306": {
                "address": "node-0:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": null,
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.29"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "node-0:3306"
}
And adding more instances continues to fail with the same communication errors.
How do I convince each instance that the address it needs to advertise is different? I will try other permutations of the localAddress setting, but it doesn't look like it's intended to fix the problem I'm having. How do I reconcile the address the instance reports for itself with the address that's actually useful for other instances to find it?
Edit to add: Maybe it is a Kubernetes thing? Or a Docker thing at any rate. There is an environment variable set in the container:
HOSTNAME=node-0
Does the containerized MySQL use that? If so, how do I override it?
Apparently this value has to be set at startup: report_host defaults to the machine's hostname (the HOSTNAME above), and it is the address the server advertises to the other members. Adding --report-host=${HOSTNAME}.cluster when starting the MySQL instances resolved the issue.
Specifically for Kubernetes, an example is at https://github.com/adamelliotfields/kubernetes/blob/master/mysql/mysql.yaml
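A condensed sketch of how the flag can be wired into a StatefulSet pod spec (image, labels, and the exec line are illustrative, not copied from that example; HOSTNAME is set by the container runtime to the pod name, so a shell is needed to expand it):

    # Sketch: passing --report-host at startup; env such as
    # MYSQL_ROOT_PASSWORD is omitted for brevity.
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: node
    spec:
      serviceName: cluster          # must match the headless Service
      replicas: 3
      selector:
        matchLabels:
          app: mysql
      template:
        metadata:
          labels:
            app: mysql
        spec:
          containers:
            - name: mysql
              image: mysql:8.0
              command: ["bash", "-c"]
              args:
                - exec docker-entrypoint.sh mysqld --report-host="${HOSTNAME}.cluster"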