amazon-ec2 dns rabbitmq kubernetes hostname

Changing hostname breaks RabbitMQ when running on Kubernetes

I'm trying to run RabbitMQ using Kubernetes on AWS. I'm using the official RabbitMQ Docker container. Each time the pod restarts, the RabbitMQ container gets a new hostname. I've set up a service (of type LoadBalancer) for the pod with a resolvable DNS name.

But when I use an EBS volume to make the RabbitMQ config, messages, and queues persistent between restarts, it breaks with:

exception exit: {{failed_to_cluster_with,
                     ['rabbitmq@rabbitmq-deployment-2901855891-nord3'],
                     "Mnesia could not connect to any nodes."},
                 {rabbit,start,[normal,[]]}}
  in function  application_master:init/4 (application_master.erl, line 134)

rabbitmq-deployment-2901855891-nord3 is the previous hostname of the RabbitMQ container. It looks as though Mnesia saved the old hostname :-/

The container's info looks like this:

              Starting broker...
=INFO REPORT==== 25-Apr-2016::12:42:42 ===
node           : rabbitmq@rabbitmq-deployment-2770204827-cboj8
home dir       : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config
cookie hash    : XXXXXXXXXXXXXXXX
log            : tty
sasl log       : tty
database dir   : /var/lib/rabbitmq/mnesia/rabbitmq

I'm only able to set the first part of the node name to rabbitmq using the RABBITMQ_NODENAME environment variable.

Setting RABBITMQ_NODENAME to a resolvable DNS name breaks with:

Can't set short node name!\nPlease check your configuration\n"

Setting RABBITMQ_USE_LONGNAME to true breaks with:

Can't set long node name!\nPlease check your configuration\n"

Update:

Setting RABBITMQ_NODENAME to rabbitmq@localhost works, but that rules out clustering instances.

          Starting broker...
=INFO REPORT==== 26-Apr-2016::11:53:19 ===
node           : rabbitmq@localhost
home dir       : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config
cookie hash    : 9WtXr5XgK4KXE/soTc6Lag==
log            : tty
sasl log       : tty
database dir   : /var/lib/rabbitmq/mnesia/rabbitmq@localhost

Setting RABBITMQ_NODENAME to the service name also works, since Kubernetes service names are internally resolvable via DNS. In this case the service is rabbitmq-service, so the node name is rabbitmq@rabbitmq-service.

          Starting broker...
=INFO REPORT==== 26-Apr-2016::11:53:19 ===
node           : rabbitmq@rabbitmq-service
home dir       : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config
cookie hash    : 9WtXr5XgK4KXE/soTc6Lag==
log            : tty
sasl log       : tty
database dir   : /var/lib/rabbitmq/mnesia/rabbitmq@rabbitmq-service

Is this the right way though? Will I still be able to cluster multiple instances if the node names are the same?

Solution

The idea is to use a separate 'service' and 'deployment' for each node you want to create.

As you said, you have to set a custom RABBITMQ_NODENAME for each node, e.g.:

RABBITMQ_NODENAME=rabbit@rabbitmq-1

Also, rabbitmq-1, rabbitmq-2 and rabbitmq-3 have to be resolvable from every node. For that you can use kubedns. The /etc/resolv.conf will look like:

search rmq.svc.cluster.local

and /etc/hosts must contain:

127.0.0.1 rabbitmq-1  # or rabbitmq-2 on node 2...
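
One way to get that /etc/hosts entry without editing the file by hand is the pod-level hostAliases field, which was added to Kubernetes after this answer was written. The fragment below is only a sketch of that option, not necessarily how the entry was created here; it goes inside the pod template of the node's deployment:

```yaml
# Pod template fragment (sketch): adds "127.0.0.1 rabbitmq-1" to the pod's /etc/hosts.
spec:
  hostAliases:
    - ip: "127.0.0.1"
      hostnames:
        - rabbitmq-1   # use rabbitmq-2 in the second node's pod, and so on
```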

The services are there to give each node a stable network identity:

rabbitmq-1.rmq.svc.cluster.local
rabbitmq-2.rmq.svc.cluster.local
rabbitmq-3.rmq.svc.cluster.local

Using a separate deployment resource per node lets you mount a different volume on each node.
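
As a rough sketch of the "one service plus one deployment per node" layout, the manifests for the first node could look something like this. Several things here are assumptions for illustration: the rmq namespace (taken from the search domain above), the app label, the rabbitmq:3 image tag, the claim name rabbitmq-1-data, and the apps/v1 API version, which is the current one rather than the one available when this was written. The ports are RabbitMQ's defaults for AMQP, epmd and inter-node traffic.

```yaml
# Sketch only: repeat both objects with rabbitmq-2 and rabbitmq-3 for the other nodes.
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-1            # becomes the host part of the node name
  namespace: rmq              # assumed namespace
spec:
  selector:
    app: rabbitmq-1
  ports:
    - name: amqp
      port: 5672
    - name: epmd              # Erlang port mapper, needed for clustering
      port: 4369
    - name: dist              # inter-node distribution port
      port: 25672
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rabbitmq-1
  namespace: rmq
spec:
  replicas: 1
  strategy:
    type: Recreate            # avoid two pods using the same volume during a rollout
  selector:
    matchLabels:
      app: rabbitmq-1
  template:
    metadata:
      labels:
        app: rabbitmq-1
    spec:
      containers:
        - name: rabbitmq
          image: rabbitmq:3   # assumed tag of the official image
          env:
            - name: RABBITMQ_NODENAME
              value: rabbit@rabbitmq-1
          ports:
            - containerPort: 5672
            - containerPort: 4369
            - containerPort: 25672
          volumeMounts:
            - name: data
              mountPath: /var/lib/rabbitmq
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: rabbitmq-1-data   # hypothetical claim, one per node
```

Because the node name is fixed to rabbit@rabbitmq-1 and the data lives on its own volume, a restarted pod comes back with the same identity that Mnesia recorded on disk.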

I'm working on a deployment tool to simplify those actions: I've done a demo on how I scale and deploy rabbitmq from 1 to 3 nodes on kubernetes: https://asciinema.org/a/2ktj7kr2d2m3w25xrpz7mjkbu?speed=1.5

More generally, the complexity you're facing when deploying a clustered application is addressed in the 'petset proposal': https://github.com/kubernetes/kubernetes/pull/18016