dns rabbitmq rabbitmqctl epmd

RabbitMQ won't cluster (nxdomain)


I want to set up two RabbitMQ servers to work as a cluster. When trying to run

rabbitmqctl join_cluster rabbit@my_rabbit_1.my.domain.name on my_rabbit_1

I get unable to connect to epmd (port 4369) on my_rabbit_2.my.domain.name: nxdomain (non-existing domain)

I use rabbitmq:latest (debian), .erlang.cookie is the same on both nodes, and the hosts resolve fine: I can ping in both directions, and nmap -6 -p 4369 my_rabbit_2.my.domain.name returns 4369/tcp open epmd

EDIT:

tcpdump shows that, while resolving the hostname, rabbit or epmd does not perform both types of DNS query (AAAA for IPv6 and A for IPv4) but only the A query for IPv4, which fails repeatedly with nxdomain because no IPv4 address is available. It does not try an AAAA query, except when running a command like rabbitmq -n rabbit@local.machine.domain.name: then it runs an AAAA query and succeeds. Hence the problem. How do I solve that?
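
The symptom described above (A lookup fails, AAAA lookup succeeds) can be checked without tcpdump. The following is a minimal Python sketch, not part of RabbitMQ. The helper names `resolve_families` and `ipv6_only` are made up for illustration, and it goes through the system resolver via `socket.getaddrinfo`, so entries in /etc/hosts count as well as DNS records.

```python
import socket
import sys


def resolve_families(host, port=4369, getaddrinfo=socket.getaddrinfo):
    """Return the addresses found per family, e.g. {"A": [], "AAAA": ["..."]}.

    The getaddrinfo parameter is injectable only so the function can be
    exercised without network access; by default the system resolver is used.
    """
    results = {}
    for label, family in (("A", socket.AF_INET), ("AAAA", socket.AF_INET6)):
        try:
            infos = getaddrinfo(host, port, family, socket.SOCK_STREAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string for both IPv4 and IPv6.
            results[label] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            # Resolution failed for this family (the nxdomain case).
            results[label] = []
    return results


def ipv6_only(host, **kwargs):
    """True when the host resolves over IPv6 but has no IPv4 address."""
    found = resolve_families(host, **kwargs)
    return bool(found["AAAA"]) and not found["A"]


if __name__ == "__main__":
    # Usage: python check_families.py my_rabbit_2.my.domain.name
    for name in sys.argv[1:]:
        print(name, resolve_families(name), "IPv6-only:", ipv6_only(name))
```

If a node name comes back as IPv6-only, any tool that asks only for IPv4 addresses will see nxdomain even though ping and nmap -6 work.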


Solution

  • Finally found a solution that worked for me. The Erlang documentation says that -proto_dist specifies the protocol for Erlang distribution and defaults to inet_tcp (TCP over IPv4). So in an IPv6-only environment you have to pass the -proto_dist inet6_tcp flag to erl.

    This can be done by adding the following lines to your rabbitmq-env.conf (see RabbitMQ configuration docs):

    # For rabbitmq-server
    RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="-proto_dist inet6_tcp"
    # For rabbitmqctl
    RABBITMQ_CTL_ERL_ARGS="-proto_dist inet6_tcp"
    

    Note that rabbitmqctl and rabbitmq-server use different erl settings: without RABBITMQ_CTL_ERL_ARGS="-proto_dist inet6_tcp" I was unable to create a cluster using rabbitmqctl join_cluster rabbit@host.in.my.domain. It should not be necessary in production mode. Also note that the RabbitMQ configuration docs advise against using this setting except for debugging.
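
When the same two lines have to go onto several nodes, they can be appended by a script. The following is a minimal Python sketch under stated assumptions: the function name `ensure_settings` is hypothetical, the file location is up to you (/etc/rabbitmq/rabbitmq-env.conf is assumed to be the usual place on Debian packages, check your installation), and a variable that is already defined is left untouched rather than merged.

```python
from pathlib import Path

# The two variables from the answer above.
SETTINGS = {
    "RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS": "-proto_dist inet6_tcp",
    "RABBITMQ_CTL_ERL_ARGS": "-proto_dist inet6_tcp",
}


def ensure_settings(conf_path, settings=SETTINGS):
    """Append missing KEY="value" lines and return the keys that were added.

    Existing definitions are not modified, even if their value differs.
    """
    path = Path(conf_path)
    existing = path.read_text() if path.exists() else ""
    # Collect variable names that are already assigned (comments are skipped).
    defined = {
        line.split("=", 1)[0].strip()
        for line in existing.splitlines()
        if "=" in line and not line.lstrip().startswith("#")
    }
    missing = [key for key in settings if key not in defined]
    if missing:
        # Make sure the appended lines start on a fresh line.
        prefix = "" if not existing or existing.endswith("\n") else "\n"
        new_lines = "".join(f'{key}="{settings[key]}"\n' for key in missing)
        path.write_text(existing + prefix + new_lines)
    return missing
```

Calling `ensure_settings("/etc/rabbitmq/rabbitmq-env.conf")` twice adds the lines only once. The node still has to be restarted for rabbitmq-server to pick up the new erl arguments.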