mariadb systemd slurm

Slurmctld: error: mysql_real_connect failed: 1045 Access denied for user 'root'@'localhost' (using password: NO)


I cannot resolve the following issue:

root@MyCluster:/opt/WorkLoadManager/slurm/23.11.5# systemctl status slurmctld
× slurmctld.service - Slurm controller daemon
     Loaded: loaded (/etc/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Tue 2024-04-09 16:18:33 CEST; 4s ago
    Process: 2592430 ExecStart=/opt/WorkLoadManager/slurm/23.11.5/sbin/slurmctld --systemd $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
   Main PID: 2592430 (code=exited, status=1/FAILURE)
        CPU: 44ms

Apr 09 16:18:32 MyCluster.num.lab slurmctld[2592430]: slurmctld: Job accounting information stored, but details not gathered
Apr 09 16:18:32 MyCluster.num.lab slurmctld[2592430]: slurmctld: slurmctld version 23.11.5 started on cluster MyCluster
Apr 09 16:18:32 MyCluster.num.lab systemd[1]: Started Slurm controller daemon.
Apr 09 16:18:32 MyCluster.num.lab slurmctld[2592430]: slurmctld: accounting_storage/slurmdbd: clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817 with slurmdbd
Apr 09 16:18:33 MyCluster.num.lab slurmctld[2592430]: slurmctld: priority/multifactor: _read_last_decay_ran: No last decay (/var/spool/slurmctld/priority_last_decay_ran) to recov>
Apr 09 16:18:33 MyCluster.num.lab slurmctld[2592430]: slurmctld: No memory enforcing mechanism configured.
Apr 09 16:18:33 MyCluster.num.lab slurmctld[2592430]: slurmctld: error: mysql_real_connect failed: 1045 Access denied for user 'root'@'localhost' (using password: NO)
Apr 09 16:18:33 MyCluster.num.lab slurmctld[2592430]: slurmctld: fatal: You haven't inited this storage yet.
Apr 09 16:18:33 MyCluster.num.lab systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE
Apr 09 16:18:33 MyCluster.num.lab systemd[1]: slurmctld.service: Failed with result 'exit-code'.

I am using MariaDB (mariadb Ver 15.1 Distrib 10.6.16-MariaDB, for debian-linux-gnu (x86_64) using EditLine wrapper), and I previously tried the following fix:

MariaDB [(none)]> ALTER USER 'root'@'localhost' IDENTIFIED BY 'PASSWD';
MariaDB [(none)]> flush privileges;
MariaDB [(none)]> USE mysql;
MariaDB [mysql]> SELECT User, Host, plugin FROM mysql.user;
+-------------+-----------+-----------------------+
| User        | Host      | plugin                |
+-------------+-----------+-----------------------+
| mariadb.sys | localhost | mysql_native_password |
| root        | localhost | mysql_native_password |
| mysql       | localhost | mysql_native_password |
| slurm       | localhost | mysql_native_password |
| slurm       | system0   | mysql_native_password |
+-------------+-----------+-----------------------+
quit
# systemctl restart mariadb

But after that I got: slurmctld: error: mysql_real_connect failed: 1698 Access denied for user 'root'@'localhost'.

I expect slurmctld.service to be active, like the other services, slurmd and slurmdbd.


Solution

  • The controller is configured to use the slurmdbd service for accounting (the log shows accounting_storage/slurmdbd), yet it also tries to connect to the MySQL database directly. My guess is therefore that the jobcomp/mysql plugin is active as well. The "(using password: NO)" part of the error message suggests that JobCompPass is not set.

    You could therefore either

    • deactivate the jobcomp/mysql plugin, since it is redundant with the accounting_storage/slurmdbd plugin; or
    • set JobCompPass in slurm.conf to the password of the root MySQL user.
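
The two options could look roughly like this in slurm.conf. This is only a sketch: it assumes the guess above is right and that jobcomp/mysql is currently enabled there. PASSWD is the placeholder from the question, not a real value, and the JobCompHost and JobCompUser values are assumptions inferred from the 'root'@'localhost' in the error message.

```
# Option 1: turn off the job completion plugin;
# job accounting still goes through slurmdbd.
JobCompType=jobcomp/none

# Option 2 (alternative): keep jobcomp/mysql and give it credentials.
# The values below are assumptions; adjust them to your setup.
#JobCompType=jobcomp/mysql
#JobCompHost=localhost
#JobCompUser=root
#JobCompPass=PASSWD
```

Use only one of the two options, then restart slurmctld so that the changed configuration is read. With option 2, a dedicated database account such as the existing 'slurm'@'localhost' user may be a better choice for JobCompUser than root, provided it has rights on the job completion database.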