I managed (with some help from here) to set up replication from a master server running MySQL 5.6 (CentOS 6) to a slave running MariaDB 10.1.22 (CentOS 7).
My issue now is this: I have another slave with exactly the same MariaDB version and specs, but its replication is not catching up; instead, the lag is increasing.
When replication started, it was 48,000 seconds behind and quickly dropped to 46,000 after a few minutes. Since then the lag has been steadily increasing; at the time of writing it is almost back to 48,000 seconds.
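To confirm that the lag is really growing rather than just fluctuating, it can help to sample Seconds_Behind_Master at a fixed interval. A minimal sketch, assuming the mysql client can log in without a prompt (for example via ~/.my.cnf); the helper name parse_lag is made up for illustration:

```shell
# Extract the Seconds_Behind_Master value from `SHOW SLAVE STATUS\G`
# output read on stdin.
parse_lag() {
    awk -F': ' '/Seconds_Behind_Master/ { print $2 }'
}

# Usage on the slave (assumption: credentials come from ~/.my.cnf):
# while true; do
#     printf '%s %s\n' "$(date +%T)" "$(mysql -e 'SHOW SLAVE STATUS\G' | parse_lag)"
#     sleep 60
# done
```

If the value rises by roughly as many seconds as the sampling interval, the SQL thread is applying next to nothing during that time.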
SHOW FULL PROCESSLIST shows the SQL thread spending up to 8 seconds at a time in the state Update_rows_log_event::ha_update_row(-1), back to back. Despite a lot of Google searching, I cannot find out what that state means.
MariaDB [(none)]> show full processlist;
+----+-------------+------+------+---------+------+------------------------------------------+------+----------+
| Id | User        | Host | db   | Command | Time | State                                    | Info | Progress |
+----+-------------+------+------+---------+------+------------------------------------------+------+----------+
|  3 | system user |      | NULL | Connect | 3640 | Queueing master event to the relay log   | NULL |    0.000 |
|  2 | system user |      | NULL | Connect |    5 | Update_rows_log_event::ha_update_row(-1) | NULL |    0.000 |
I also caught a simple UPDATE table SET timestamp = NOW() WHERE static_ip = 'a-valid-ip' AND process_id = '13217'
taking up to 6 seconds on the slave, even though static_ip and process_id together form the table's primary key and the same statement takes 0.078 seconds when executed directly.
Contents of /etc/my.cnf:
[mysqld]
max_allowed_packet = 1G
max_connections = 600
thread_cache_size = 16
query_cache_size = 64M
tmp_table_size= 512M
max_heap_table_size= 512M
wait_timeout=60
#Innodb Settings
innodb_file_per_table=1
innodb_buffer_pool_size = 25G
innodb_log_file_size = 2048M
innodb_flush_log_at_trx_commit = 0
innodb_file_format = Barracuda
innodb_flush_neighbors = 0
#Log
log-error =/var/log/error.log
tmpdir = /dev/shm
#Replication SLAVE
server-id=6
slave-skip-errors=1062
The my.cnf is the same as on the slave that is running OK, except for the server-id.
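One way to double-check that the two slaves really run with the same settings is to diff the normalized config files. A rough sketch: the helper names and the file names in the usage line are hypothetical, and it does not treat dashes and underscores in option names as equivalent, although the server does.

```shell
# Strip comments, blank lines and whitespace, drop the server-id line
# (expected to differ between slaves), and sort what is left.
normalize_cnf() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" | tr -d ' \t' | grep -Ev '^server[-_]id=' | sort
}

# Print the settings that differ between two my.cnf files.
# Returns 0 when the normalized files match, non-zero otherwise.
cnf_diff() {
    a=$(mktemp)
    b=$(mktemp)
    normalize_cnf "$1" > "$a"
    normalize_cnf "$2" > "$b"
    status=0
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Usage (hypothetical file names, one copy fetched from each slave):
# cnf_diff my.cnf.good-slave my.cnf.lagging-slave
```

No output means the two files agree on everything except server-id.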
Any suggestions/help on what is happening?
Thank you.
With help from the folks at MariaDB, it turned out that the ha_update_row state was not relevant: the slowness was caused by a dual disk failure on the machine. A simple sequential write test with dd shows how slow the storage was:
[root@ser3 ~]# dd if=/dev/zero of=/tmp/output conv=fdatasync bs=384k count=1k;
1024+0 records in
1024+0 records out
402653184 bytes (403 MB) copied, 43.1096 s, 9.3 MB/s
This is on an SSD, where 9.3 MB/s is far below normal sequential write throughput.
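As a sanity check on the dd figure: bs=384k with count=1k is 384 * 1024 * 1024 bytes, and dividing that by the elapsed time gives the rate dd reports in decimal megabytes. A small sketch (throughput_mbs is just an illustrative helper name):

```shell
# Bytes written by dd: block size (384 KiB) times block count (1024).
echo $((384 * 1024 * 1024))   # 402653184

# Throughput in MB/s (decimal megabytes, as dd reports it),
# given the byte count and the elapsed seconds.
throughput_mbs() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

throughput_mbs 402653184 43.1096   # 9.3
```

Running the same dd command on the healthy slave would give a baseline to compare against.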