Tags: apache, nagios, opensuse

Web UI for Nagios says "Not Running" although it is


Been sorting through this on my own for a few days so looking for a pointer or two. I was running Nagios 4.4.6 and decided to upgrade to 4.4.9. I downloaded the source and built and installed the update. It all went smoothly. I checked that Nagios was running and it was. I then restarted Apache.

The UI for Nagios was up and running, but the version it displayed was wrong: it still showed 4.4.6. I went back and checked the running version of Nagios from the CLI, and it also still showed 4.4.6. I checked the location of the executable, and the file there still had the old date.
So I located the newly built Nagios Core executable and copied it to /usr/sbin. I then restarted Nagios and it showed 4.4.9, so that was good. I restarted Apache, and the UI now shows the correct version of Nagios, but the status is "Not Running" even though the CLI shows that it is running.
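A quick way to see which copy of the executable the service actually starts, as opposed to which copies merely exist on disk, might look like the sketch below. The helper names unit_exec_path and list_executables are hypothetical (they are not part of Nagios or systemd), and the directories searched are only the two mentioned in this post.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Nagios or systemd.

# unit_exec_path UNIT_FILE
# Print the program named on the first ExecStart= line of a systemd unit file.
# ExecStartPre= lines are ignored; optional systemd prefixes (-, @, +, !, :) are stripped.
unit_exec_path() {
    sed -n 's/^ExecStart=[-@+!:]*\([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# list_executables NAME DIR...
# Print DIR/NAME for every directory that holds an executable regular file NAME.
list_executables() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
}

# Unit file path taken from the systemctl output in this post.
unit=/usr/lib/systemd/system/nagios.service
if [ -r "$unit" ]; then
    echo "service starts: $(unit_exec_path "$unit")"
fi
echo "copies on disk:"
list_executables nagios /usr/sbin /usr/local/nagios/bin
```

If the path printed for the service differs from the copy that was just replaced, the daemon and the freshly built binary are not the same file, which would explain a stale version number.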

This SO post is very similar but wasn't the solution for me: Nagios is active on CLI but not running on Web Interface.

I have gone over the configuration multiple times but I am obviously missing something simple.
Can anyone see what I am missing?
I am on OpenSUSE Leap 15.3.

Nagios service status:

opensuse:/home/pete # systemctl status nagios
● nagios.service - Nagios Core 4.4.9
     Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; vendor preset: disabled)
     Active: active (running) since Sat 2023-01-14 13:44:14 CST; 1 day 18h ago
       Docs: https://www.nagios.org/documentation
    Process: 1530 ExecStartPre=/usr/local/nagios/bin/nagios -v /etc/nagios/nagios.cfg (code=exited, status=0/SUCCESS)
    Process: 1540 ExecStart=/usr/local/nagios/bin/nagios -d /etc/nagios/nagios.cfg (code=exited, status=0/SUCCESS)
   Main PID: 1545 (nagios)
      Tasks: 16 (limit: 4915)
     CGroup: /system.slice/nagios.service
             ├─ 1545 /usr/local/nagios/bin/nagios -d /etc/nagios/nagios.cfg
             ├─ 1546 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1547 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1548 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1549 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1550 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1551 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1552 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1553 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1554 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1555 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1556 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1557 /usr/local/nagios/bin/nagios --worker /var/lib/nagios/nagios.qh
             ├─ 1696 /usr/local/nagios/bin/nagios -d /etc/nagios/nagios.cfg
             ├─10894 /usr/local/nagios/lib/check_ping -H 127.0.0.1 -w 100.0,20% -c 500.0,60% -p 5
             └─10895 /usr/bin/ping -n -U -w 10 -c 5 127.0.0.1

Jan 16 00:00:00 opensuse nagios[1545]: CURRENT SERVICE STATE: localhost;Swap Usage;OK;HARD;1;SWAP OK - 100% free (2048 MB out of 2048 MB)
Jan 16 00:00:00 opensuse nagios[1545]: CURRENT SERVICE STATE: localhost;Total Processes;OK;HARD;1;PROCS OK: 80 processes with STATE = RSZDT
Jan 16 00:44:14 opensuse nagios[1545]: Auto-save of retention data completed successfully.

Nagios config test:

opensuse:/home/pete # /usr/local/nagios/bin/nagios -v /etc/nagios/nagios.cfg

Nagios Core 4.4.9
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2022-11-16
License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 8 services.
        Checked 1 hosts.
        Checked 1 host groups.
        Checked 0 service groups.
        Checked 1 contacts.
        Checked 1 contact groups.
        Checked 24 commands.
        Checked 5 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 1 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check

Apache2 status:

opensuse:/home/pete # systemctl status apache2
● apache2.service - The Apache Webserver
     Loaded: loaded (/usr/lib/systemd/system/apache2.service; enabled; vendor preset: disabled)
     Active: active (running) since Sat 2023-01-14 13:44:14 CST; 1 day 18h ago
   Main PID: 1605 (httpd-prefork)
     Status: "Processing requests..."
      Tasks: 8
     CGroup: /system.slice/apache2.service
             ├─ 1605 /usr/sbin/httpd-prefork -DSYSCONFIG -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d//loadmodule.conf -C Include /etc/apache2/sys>
             ├─ 1648 /usr/sbin/httpd-prefork -DSYSCONFIG -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d//loadmodule.conf -C Include /etc/apache2/sys>
             ├─ 1649 /usr/sbin/httpd-prefork -DSYSCONFIG -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d//loadmodule.conf -C Include /etc/apache2/sys>
             ├─ 1650 /usr/sbin/httpd-prefork -DSYSCONFIG -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d//loadmodule.conf -C Include /etc/apache2/sys>
             ├─ 1651 /usr/sbin/httpd-prefork -DSYSCONFIG -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d//loadmodule.conf -C Include /etc/apache2/sys>
             ├─ 1652 /usr/sbin/httpd-prefork -DSYSCONFIG -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d//loadmodule.conf -C Include /etc/apache2/sys>
             ├─16196 /usr/sbin/httpd-prefork -DSYSCONFIG -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d//loadmodule.conf -C Include /etc/apache2/sys>
             └─16216 /usr/sbin/httpd-prefork -DSYSCONFIG -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d//loadmodule.conf -C Include /etc/apache2/sys>

Jan 14 13:44:14 opensuse systemd[1]: Starting The Apache Webserver...
Jan 14 13:44:14 opensuse systemd[1]: Started The Apache Webserver.

The installation summary for Nagios 4.4.9:

 General Options:
 -------------------------
        Nagios executable:  nagios
        Nagios user/group:  nagios,nagios
       Command user/group:  nagios,nagios
             Event Broker:  yes
        Install ${prefix}:  /usr/local/nagios
    Install ${includedir}:  /usr/local/nagios/include/nagios
                Lock file:  /run/nagios.lock
   Check result directory:  /usr/local/nagios/var/spool/checkresults
           Init directory:  /usr/lib/systemd/system
  Apache conf.d directory:  /etc/apache2/vhosts.d
             Mail program:  /usr/bin/mail
                  Host OS:  linux-gnu
          IOBroker Method:  epoll

 Web Interface Options:
 ------------------------
                 HTML URL:  http://localhost/nagios/
                  CGI URL:  http://localhost/nagios/cgi-bin/
 Traceroute (used by WAP):  /usr/sbin/traceroute

Solution

  • It took a lot of trial and error, but I sorted it out. I think the primary issue was that the original 4.4.6 install had been configured differently from the 4.4.9 that I installed as an upgrade. There were subtle differences in the paths in the config files, but once I got everything sorted out, the web UI showed Nagios as running.

    I wish I could detail exactly which steps I took, but I tried and changed so many things that I lost track of what the actual solution was. I can give you the final step. After changing the location of the nagios.cfg file in /usr/lib/systemd/system/nagios.service, I saw errors in the log about missing objects in /usr/local/nagios/libexec/, which led me to discover that the 4.4.6 version had those objects in the /usr/local/nagios/lib folder. I copied the missing files into the libexec folder and it started working.

    I have a recollection that my last upgrade was problematic and hacky and I am probably living with the results of that. I don't remember how Nagios was originally installed so it may go back to an earlier installation.

    I am still suspicious of the current situation and I plan to go back through the details of where everything should be. But for now it is working.
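For anyone retracing this kind of path mismatch, the comparison can be scripted roughly as follows. This is a sketch under assumptions, not the exact steps taken above: cfg_value and missing_in are hypothetical helpers, and the default cgi.cfg location is a guess that should be overridden to match the system. It relies on two standard directives: main_config_file in cgi.cfg (the nagios.cfg the web CGIs read) and status_file in nagios.cfg (where the daemon writes its status). If the CGIs and the daemon end up reading different status files, that is one plausible reason for the UI to report "Not Running".

```shell
#!/bin/sh
# Sketch only: cfg_value, missing_in and the default paths are illustrative assumptions.

# cfg_value KEY FILE
# Print the value of the last uncommented "KEY=VALUE" line in FILE.
cfg_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# missing_in OLD_DIR NEW_DIR
# Print the names of files present in OLD_DIR but absent from NEW_DIR.
missing_in() {
    for f in "$1"/*; do
        [ -e "$f" ] || continue
        [ -e "$2/$(basename "$f")" ] || basename "$f"
    done
}

# Assumed locations; set these variables in the environment to override.
DAEMON_CFG=${DAEMON_CFG:-/etc/nagios/nagios.cfg}
CGI_CFG=${CGI_CFG:-/usr/local/nagios/etc/cgi.cfg}

if [ -r "$DAEMON_CFG" ] && [ -r "$CGI_CFG" ]; then
    cgi_main=$(cfg_value main_config_file "$CGI_CFG")
    echo "daemon reads: $DAEMON_CFG"
    echo "CGIs read:    $cgi_main"
    if [ -r "$cgi_main" ]; then
        echo "status_file (daemon): $(cfg_value status_file "$DAEMON_CFG")"
        echo "status_file (CGIs):   $(cfg_value status_file "$cgi_main")"
    fi
fi

# Plugins that exist in the old lib folder but not in libexec.
missing_in /usr/local/nagios/lib /usr/local/nagios/libexec
```

Any name printed by the last line is a file that the old layout had and the new plugin directory lacks, which matches the "missing objects" errors described above.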