I cannot resolve the following issue:
root@MyCluster:/opt/WorkLoadManager/slurm/23.11.5# systemctl status slurmctld
× slurmctld.service - Slurm controller daemon
Loaded: loaded (/etc/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2024-04-09 16:18:33 CEST; 4s ago
Process: 2592430 ExecStart=/opt/WorkLoadManager/slurm/23.11.5/sbin/slurmctld --systemd $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
Main PID: 2592430 (code=exited, status=1/FAILURE)
CPU: 44ms
Apr 09 16:18:32 MyCluster.num.lab slurmctld[2592430]: slurmctld: Job accounting information stored, but details not gathered
Apr 09 16:18:32 MyCluster.num.lab slurmctld[2592430]: slurmctld: slurmctld version 23.11.5 started on cluster MyCluster
Apr 09 16:18:32 MyCluster.num.lab systemd[1]: Started Slurm controller daemon.
Apr 09 16:18:32 MyCluster.num.lab slurmctld[2592430]: slurmctld: accounting_storage/slurmdbd: clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817 with slurmdbd
Apr 09 16:18:33 MyCluster.num.lab slurmctld[2592430]: slurmctld: priority/multifactor: _read_last_decay_ran: No last decay (/var/spool/slurmctld/priority_last_decay_ran) to recov>
Apr 09 16:18:33 MyCluster.num.lab slurmctld[2592430]: slurmctld: No memory enforcing mechanism configured.
Apr 09 16:18:33 MyCluster.num.lab slurmctld[2592430]: slurmctld: error: mysql_real_connect failed: 1045 Access denied for user 'root'@'localhost' (using password: NO)
Apr 09 16:18:33 MyCluster.num.lab slurmctld[2592430]: slurmctld: fatal: You haven't inited this storage yet.
Apr 09 16:18:33 MyCluster.num.lab systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE
Apr 09 16:18:33 MyCluster.num.lab systemd[1]: slurmctld.service: Failed with result 'exit-code'.
I am using mariadb Ver 15.1 Distrib 10.6.16-MariaDB, for debian-linux-gnu (x86_64) using EditLine wrapper and I previously tried the following fix:
MariaDB [(none)]> ALTER USER 'root'@'localhost' IDENTIFIED BY 'PASSWD';
MariaDB [(none)]> flush privileges;
MariaDB [(none)]> USE mysql;
MariaDB [mysql]> SELECT User, Host, plugin FROM mysql.user;
+-------------+-----------+-----------------------+
| User | Host | plugin |
+-------------+-----------+-----------------------+
| mariadb.sys | localhost | mysql_native_password |
| root | localhost | mysql_native_password |
| mysql | localhost | mysql_native_password |
| slurm | localhost | mysql_native_password |
| slurm | system0 | mysql_native_password |
+-------------+-----------+-----------------------+
quit
#systemctl restart mariadb
But then I got a different error: slurmctld: error: mysql_real_connect failed: 1698 Access denied for user 'root'@'localhost'
I expect slurmctld.service to be active, like the other services, slurmd and slurmdbd.
The controller seems to be configured to use the slurmdbd service for accounting (accounting_storage/slurmdbd), yet it tries to access the MySQL database directly. I therefore guess the jobcomp/mysql plugin is active too. The "(using password: NO)" part of the error message seems to indicate that JobCompPass is not set.
You could therefore either:
- disable the jobcomp/mysql plugin, as it is redundant with the accounting_storage/slurmdbd plugin; or
- set JobCompPass to the password of the root MySQL user.
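
As a rough sketch, the two options might look like this in slurm.conf. This assumes the job completion plugin really is set to jobcomp/mysql in your configuration, which I have not seen. PASSWD is the placeholder from your ALTER USER statement, and JobCompUser=root is only inferred from the error message. Use one option, not both.

```
# Option 1: turn off the job completion plugin and rely on
# accounting_storage/slurmdbd for job records.
JobCompType=jobcomp/none

# Option 2 (alternative): keep jobcomp/mysql and give it credentials.
# The user and password below are assumptions; adjust them to your setup.
#JobCompType=jobcomp/mysql
#JobCompUser=root
#JobCompPass=PASSWD
```

After editing the file, restart the controller with systemctl restart slurmctld and check systemctl status slurmctld again.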