Tags: amazon-ec2, dns, rabbitmq, kubernetes, hostname

Changing hostname breaks RabbitMQ when running on Kubernetes


I'm trying to run RabbitMQ using Kubernetes on AWS. I'm using the official RabbitMQ Docker container. Each time the pod restarts, the RabbitMQ container gets a new hostname. I've set up a service (of type LoadBalancer) for the pod with a resolvable DNS name.

But when I use an EBS volume to make the RabbitMQ config, messages and queues persistent between restarts, it breaks with:

exception exit: {{failed_to_cluster_with,
                     ['rabbitmq@rabbitmq-deployment-2901855891-nord3'],
                     "Mnesia could not connect to any nodes."},
                 {rabbit,start,[normal,[]]}}
  in function  application_master:init/4 (application_master.erl, line 134)

rabbitmq-deployment-2901855891-nord3 is the previous hostname of the RabbitMQ container. It looks as if Mnesia saved the old hostname :-/

The container's info looks like this:

              Starting broker...
=INFO REPORT==== 25-Apr-2016::12:42:42 ===
node           : rabbitmq@rabbitmq-deployment-2770204827-cboj8
home dir       : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config
cookie hash    : XXXXXXXXXXXXXXXX
log            : tty
sasl log       : tty
database dir   : /var/lib/rabbitmq/mnesia/rabbitmq

Using the RABBITMQ_NODENAME environment variable, I'm only able to set the first part of the node name (here, rabbitmq).

Setting RABBITMQ_NODENAME to a resolvable DNS name breaks with:

Can't set short node name!\nPlease check your configuration\n"

Setting RABBITMQ_USE_LONGNAME to true breaks with:

Can't set long node name!\nPlease check your configuration\n"

Update:

  • Setting RABBITMQ_NODENAME to rabbitmq@localhost works, but that rules out clustering instances.

              Starting broker...
    =INFO REPORT==== 26-Apr-2016::11:53:19 ===
    node           : rabbitmq@localhost
    home dir       : /var/lib/rabbitmq
    config file(s) : /etc/rabbitmq/rabbitmq.config
    cookie hash    : 9WtXr5XgK4KXE/soTc6Lag==
    log            : tty
    sasl log       : tty
    database dir   : /var/lib/rabbitmq/mnesia/rabbitmq@localhost
    
  • Setting RABBITMQ_NODENAME to use the service name (in this case rabbitmq-service, giving rabbitmq@rabbitmq-service) also works, since Kubernetes service names are resolvable via DNS inside the cluster.

              Starting broker...
    =INFO REPORT==== 26-Apr-2016::11:53:19 ===
    node           : rabbitmq@rabbitmq-service
    home dir       : /var/lib/rabbitmq
    config file(s) : /etc/rabbitmq/rabbitmq.config
    cookie hash    : 9WtXr5XgK4KXE/soTc6Lag==
    log            : tty
    sasl log       : tty
    database dir   : /var/lib/rabbitmq/mnesia/rabbitmq@rabbitmq-service
    

Is this the right way though? Will I still be able to cluster multiple instances if the node names are the same?


Solution

  • The idea is to use a separate 'service' and 'deployment' for each node you want to create.

    As you said, you have to set a custom NODENAME for each one, e.g.:

    RABBITMQ_NODENAME=rabbit@rabbitmq-1
    

    The names rabbitmq-1, rabbitmq-2 and rabbitmq-3 must also resolve from every node. You can use kube-dns for that. The /etc/resolv.conf will look like:

    search rmq.svc.cluster.local 
    

    and /etc/hosts must contain:

    127.0.0.1 rabbitmq-1  # or rabbitmq-2 on node 2...
    

    The services are there to give each node a stable network identity. With the rmq namespace implied by the search line above, the fully qualified names are:

    rabbitmq-1.rmq.svc.cluster.local
    rabbitmq-2.rmq.svc.cluster.local
    rabbitmq-3.rmq.svc.cluster.local
    

    Using a separate deployment resource per node lets you mount a different volume on each node.

    I'm working on a deployment tool to simplify these steps. Here is a demo of how I deploy and scale RabbitMQ from 1 to 3 nodes on Kubernetes: https://asciinema.org/a/2ktj7kr2d2m3w25xrpz7mjkbu?speed=1.5

    More generally, the complexity you're facing when deploying a clustered application is addressed in the 'petset proposal' (PetSet was later renamed StatefulSet): https://github.com/kubernetes/kubernetes/pull/18016
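
    To make the one-service-plus-one-deployment-per-node idea more concrete, here is a rough sketch of what the resources for the first node might look like. It is not taken from the tool above. The namespace rmq, the app label, the rabbitmq:3 image tag and the claim name rabbitmq-1-data are assumptions made for illustration, and the manifest uses the current apps/v1 API and the hostAliases field, both of which are newer than this answer. Repeat it with -2 and -3 for the other nodes.

    ```yaml
    # Sketch only: namespace, labels, image tag and PVC name are hypothetical.
    apiVersion: v1
    kind: Service
    metadata:
      name: rabbitmq-1            # stable DNS name for node 1
      namespace: rmq
    spec:
      selector:
        app: rabbitmq-1
      ports:
        - name: amqp
          port: 5672              # client connections
        - name: epmd
          port: 4369              # Erlang port mapper, used for node discovery
        - name: dist
          port: 25672             # default inter-node distribution port
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: rabbitmq-1
      namespace: rmq
    spec:
      replicas: 1
      strategy:
        type: Recreate            # never run two pods against the same volume
      selector:
        matchLabels:
          app: rabbitmq-1
      template:
        metadata:
          labels:
            app: rabbitmq-1
        spec:
          hostAliases:            # adds "127.0.0.1 rabbitmq-1" to /etc/hosts
            - ip: "127.0.0.1"
              hostnames:
                - rabbitmq-1
          containers:
            - name: rabbitmq
              image: rabbitmq:3   # assumed tag of the official image
              env:
                - name: RABBITMQ_NODENAME
                  value: rabbit@rabbitmq-1
              ports:
                - containerPort: 5672
                - containerPort: 4369
                - containerPort: 25672
              volumeMounts:
                - name: data
                  mountPath: /var/lib/rabbitmq   # Mnesia database lives here
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: rabbitmq-1-data       # hypothetical claim, one per node
    ```

    Because the node name is fixed to rabbit@rabbitmq-1, the Mnesia directory on the volume keeps matching the node after a pod restart. Clustering also requires every node to share the same Erlang cookie, which this sketch does not configure.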