Unable to create MariaDB Galera Cluster


I have built an image based on mariadb:10.1 that only adds a new cluster.conf. The first node starts successfully, but the second node fails with the error below. Can somebody help me debug this?

Error log tail

2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
     at gcomm/src/pc.cpp:connect():162
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1380: Failed to open channel 'test_cluster' at 'gcomm://172.17.0.2,172.17.0.3,172.17.0.4': -110 (Connection timed out)
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs connect failed: Connection timed out
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: wsrep::connect(gcomm://172.17.0.2,172.17.0.3,172.17.0.4) failed: 7
2016-09-28 10:12:55 139799503415232 [ERROR] Aborting

MySQL init process failed.

Debugging steps taken

NOTE: I verified that the container IP addresses match those shown.

  1. To check that networking between containers works, I started another container and was able to log in to the first container's MySQL instance from it.
  2. This is definitely not related to MYSQL_HOST.
  3. To see whether the container was running out of memory, I ran docker stats: the failed container used only about 142MB throughout its lifecycle until it failed, far less than the memory it was allowed (~4GB).
  4. I am using Docker for Mac, but running the same setup in a CentOS VirtualBox VM gives the same result, so Docker for Mac does not appear to be the cause.
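
A MySQL login as in step 1 only exercises the client port (3306 by default), whereas Galera nodes reach each other for group communication on port 4567 by default. A minimal sketch for checking that port from inside a joiner container follows. It assumes bash and coreutils timeout are available in the image; GALERA_PORT, port_open and check_galera_port are illustrative names, not part of MariaDB or Docker.

```shell
#!/bin/sh
# Default Galera group communication port. Ports 4568 (IST) and 4444 (SST)
# are only listened on while a state transfer is in progress, so they are
# not checked here.
GALERA_PORT=4567

# port_open HOST PORT: succeed if a TCP connection opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device; host and port are passed as positional
# arguments ($0 and $1 of the inner shell) rather than interpolated.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# check_galera_port HOST [PORT]: print whether the port is reachable.
check_galera_port() {
  port="${2:-$GALERA_PORT}"
  if port_open "$1" "$port"; then
    echo "$1:$port open"
  else
    echo "$1:$port unreachable"
  fi
}

# Example (run inside the second container; IP taken from the question):
# check_galera_port 172.17.0.2
```

An "unreachable" result for the bootstrap node would point at networking; an "open" result suggests looking elsewhere, for example at the container's init process.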

Config

[mysqld]
user=mysql
binlog_format=ROW
bind-address=0.0.0.0
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=122M
innodb_file_per_table=1
innodb_doublewrite=1
query_cache_size=0
query_cache_type=0
wsrep_on=ON
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_sst_method=rsync

Steps to start containers

# bootstrap node
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
  activatedgeek/mariadb:devel \
    --wsrep-cluster-name=test_cluster \
    --wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4 \
    --wsrep-new-cluster

# add node into cluster
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
  activatedgeek/mariadb:devel \
    --wsrep-cluster-name=test_cluster \
    --wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4

# add node into cluster
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
  activatedgeek/mariadb:devel \
    --wsrep-cluster-name=test_cluster \
    --wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4
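
Once the nodes are up, the wsrep_cluster_size status variable reports how many nodes have joined. A small sketch for reading it follows, assuming the mysql client is available inside the container and the root password from the commands above. parse_cluster_size, cluster_size and the container name node1 are hypothetical; the docker run commands above would need a --name option for the name to exist.

```shell
# parse_cluster_size: read "name<TAB>value" rows (as printed by mysql -N -B)
# on stdin and print the value of wsrep_cluster_size.
parse_cluster_size() {
  awk '$1 == "wsrep_cluster_size" { print $2 }'
}

# cluster_size CONTAINER: query a running node through docker exec.
cluster_size() {
  docker exec "$1" mysql -uroot -p123 -N -B \
    -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'" | parse_cluster_size
}

# Example: cluster_size node1
```

With all three nodes joined, the reported size should be 3.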

Solution

  • This problem is caused by a hanging init process. The configuration and CLI arguments above are correct. The only thing needed is to create an empty mysql directory inside the data directory (/var/lib/mysql by default) before the init process starts. Create it on every node except the bootstrap node.

    mkdir -p /var/lib/mysql/mysql
    

    See the sample MariaDB Cluster project for usage; it uses a custom MariaDB image and is a proof of concept for creating clusters.
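
One way to apply this on joiner nodes is a small wrapper that runs before the image's own startup script. The sketch below is only an illustration under assumptions: prepare_joiner_datadir is a hypothetical helper, and the entrypoint name in the comment is an assumption about the base image, not something confirmed by the answer.

```shell
# prepare_joiner_datadir [DATADIR]: create the empty "mysql" directory that
# the answer calls for. Intended for non-bootstrap nodes only.
prepare_joiner_datadir() {
  datadir="${1:-/var/lib/mysql}"
  mkdir -p "$datadir/mysql"
}

# Hypothetical wrapper usage on a joiner, before handing over to the image's
# entrypoint (script name assumed, adjust to the actual image):
#   prepare_joiner_datadir /var/lib/mysql && exec docker-entrypoint.sh "$@"
```

Depending on the image, the new directory may also need to be owned by the mysql user before the server starts.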