
NRPE Python script output bug


I have been tasked with writing a custom Python script (since I'm bad with Bash) to run on a remote NRPE client. It recursively counts the number of files in the /tmp directory. This is my script:

#!/usr/bin/python3.5
import os
import subprocess
import sys
file_count = sum([len(files) for r, d, files in os.walk("/tmp")]) #Recursive check of /tmp




if file_count < 1000:
        x = subprocess.Popen(['echo', 'OK -', str(file_count), 'files in /tmp.'], stdout=subproce$
        print(x.communicate()[0].decode("utf-8")) #Converts from byteobj to str
#       subprocess.run('exit 0', shell=True, check=True) #Service OK  - exit 0
        sys.exit(0)

elif 1000 <= file_count < 1500:
        x = subprocess.Popen(['echo', 'WARNING -', str(file_count), 'files in /tmp.'], stdout=sub$
        print(x.communicate()[0].decode("utf-8")) #Converts from byteobj to str
        sys.exit(1)
else:
        x = subprocess.Popen(['echo', 'CRITICAL -', str(file_count), 'files in /tmp.'], stdout=su$
        print(x.communicate()[0].decode("utf-8")) #Converts from byteobj to str
        sys.exit(2)
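
For comparison, here is a minimal sketch of the same check without the `echo` subprocess, since `print` already writes to stdout. The names `count_files`, `nagios_status` and `main` are hypothetical helpers introduced for illustration; the thresholds and message format are the ones from the script above.

```python
#!/usr/bin/env python3
import os

# Nagios plugin exit codes
OK, WARNING, CRITICAL = 0, 1, 2


def count_files(root):
    """Recursively count the entries that os.walk reports as files under root."""
    return sum(len(files) for _, _, files in os.walk(root))


def nagios_status(file_count, warn=1000, crit=1500):
    """Map a file count to (exit_code, message) using the thresholds above."""
    if file_count < warn:
        code, label = OK, "OK"
    elif file_count < crit:
        code, label = WARNING, "WARNING"
    else:
        code, label = CRITICAL, "CRITICAL"
    return code, "{} - {} files in /tmp.".format(label, file_count)


def main():
    code, message = nagios_status(count_files("/tmp"))
    print(message)  # NRPE passes the plugin's stdout and exit code back to Nagios
    return code
```

To use it as the plugin, the file would end with `import sys` and `sys.exit(main())`. Splitting the threshold logic from the counting makes it easy to test the logic with a hardcoded count.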

EDIT 1: I tried hardcoding file_count to 1300 and I got a WARNING: 1300 files in /tmp. It appears the issue lies solely in the Nagios server's ability to read the files in the client machine's /tmp.
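
A quick way to tell an empty directory apart from an unreadable one: `os.walk` silently skips directories it cannot list unless it is given an `onerror` callback. The sketch below collects those errors alongside the count. `count_files_verbose` is a hypothetical helper for debugging, not part of the script above.

```python
import os


def count_files_verbose(root):
    """Count files under root and collect the errors os.walk hit while listing."""
    errors = []
    # os.walk ignores listing errors by default; onerror receives the OSError instead
    total = sum(len(files) for _, _, files in os.walk(root, onerror=errors.append))
    return total, errors
```

A zero count with an empty error list would suggest that the directory the process sees really is empty, rather than that permissions are blocking the walk.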

What I have done:

  • I have the script in the directory with the rest of the scripts.
  • I have edited /usr/local/nagios/etc/nrpe.cfg on the client machine with the following line:

    command[check_tmp]=/usr/local/nagios/libexec/check_tmp.py
    
  • I have edited the /usr/local/nagios/etc/servers/testserver.cfg file on the Nagios server as follows:

    define service {
            use                             generic-service
            host_name                       wp-proxy
            service_description             Files in /tmp
            check_command                   check_nrpe!check_tmp
    
    }
    

The output:
The correct output is: OK - 3 files in /tmp

  • When I run the script on the client machine as root, I get the correct output.
  • When I run the script on the client machine as the nagios user, I get the correct output.
  • The check APPEARS to be working in Nagios Core, but it shows 0 files in /tmp when I know there are more. I made 2 files on the client machine and 1 file on the Nagios server.

The server output for reference:

https://puu.sh/BioHW/838ba84c3e.png

(Ignore the bottom server; any fix that works for wp-proxy will also be applied to wpreess-gkanc1.)

EDIT 2: I ran the following on the Nagios server:

/usr/local/nagios/libexec/check_nrpe -H 192.168.1.59 -c check_tmp_folder 

It did indeed return 0 files. I still don't know how this can be fixed, however.


Solution

  • SOLVED!

    Solution:

    • Go to your systemd file for nrpe. Mine was found here:

      /lib/systemd/system/nrpe.service
      
    • If not there, run:

      find / -name "nrpe.service"
      

    and ignore all system.slice results

    • Open the file with vi/nano
    • Find a line which says PrivateTmp= (usually second to last line)
    • If it is set to true, set it to false
    • Save and exit the file, then run the following two commands (as root):

      systemctl daemon-reload
      systemctl restart nrpe.service
      

    Problem solved.

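    Short explanation: on Debian 9.x, some services managed by systemd have private tmp directories enabled by default. With PrivateTmp=true, the service gets its own private /tmp that is separate from the system-wide one, which is why the script counted 0 files when run through NRPE but gave the correct count when run from a shell. If you have other programs that have trouble searching or indexing /tmp, the same fix can be adapted to them.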
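
If you would rather check a unit file from a script than open it by hand, something like the sketch below could work. `private_tmp_enabled` is a hypothetical helper, and it is a simplification: it only scans the text it is given, ignores sections, and does not take drop-in override files into account.

```python
def private_tmp_enabled(unit_text):
    """Return True if the last PrivateTmp= assignment in the unit text is a true value."""
    enabled = False  # systemd's default when the directive is absent
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip comments and lines that are not key=value assignments
        if line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "PrivateTmp":
            # systemd accepts 1/yes/true/on as true boolean values
            enabled = value.strip().lower() in ("1", "yes", "true", "on")
    return enabled


# Sample unit text for illustration only, not the real nrpe.service
sample_unit = """\
[Service]
PrivateTmp=true
"""
print(private_tmp_enabled(sample_unit))
```

For the sample text this prints `True`. To check the real file, read it first, for example with `open("/lib/systemd/system/nrpe.service").read()`, and pass the text in.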