Some of my triggers report a process as unavailable, but when I check on the host the process is running fine. The trigger expression is:
{$hostname:proc.num[,,,/etc/alternatives/java].last()}=0
It works fine for some hosts, but on others the trigger fires and a "process unavailable" alert is sent.
Affected host:
# ps ax | grep java
1717 ? Ssl 119:15 /etc/alternatives/java -Dcom.sun.akuma.Daemon=daemonized -Djava.awt.headless=true -Djsse.enableSNIExtension=false -DJENKINS_HOME=/var/lib/jenkins -jar /usr/lib/jenkins/jenkins.war --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war --daemon --httpPort=-1 --httpsPort=8443 --ajp13Port=8009 --debug=5 --handlerCountMax=100 --handlerCountMaxIdle=20 --httpsCertificate=/var/lib/jenkins/.ssl/hostssl.crt --httpsPrivateKey=/var/lib/jenkins/.ssl/hostssl.key
Zabbix log:
2000:20160901:081336.721 Starting Zabbix Agent [$hostname]. Zabbix 2.2.8 (revision 51174).
2000:20160901:081336.721 using configuration file: /etc/zabbix/zabbix_agentd.conf
2002:20160901:081336.724 agent #0 started [collector]
2004:20160901:081336.724 agent #2 started [listener #2]
2005:20160901:081336.725 agent #3 started [listener #3]
2006:20160901:081336.725 agent #4 started [active checks #1]
2003:20160901:081336.725 agent #1 started [listener #1]
cat: /proc//status: No such file or directory
cat: /proc//status: No such file or directory
cat: /proc//status: No such file or directory
cat: /proc//status: No such file or directory
Host that reports Zabbix data properly:
# ps ax | grep java
2472 ? Ssl 1279:26 /etc/alternatives/java -Dcom.sun.akuma.Daemon=daemonized -Djava.awt.headless=true -Djsse.enableSNIExtension=false -Dorg.apache.commons.jelly.tags.fmt.timeZone=Europe/Dublin -DJENKINS_HOME=/var/lib/jenkins -jar /usr/lib/jenkins/jenkins.war --logfile=/var/log/jenkins/jenkins.log --webroot=/var/cache/jenkins/war --daemon --httpPort=-1 --httpsPort=8443 --ajp13Port=8009 --debug=5 --handlerCountMax=100 --handlerCountMaxIdle=20 --httpsCertificate=/var/lib/jenkins/.security/hostssl.crt --httpsPrivateKey=/var/lib/jenkins/.security/hostssl.key --httpsPort=8443
On this host the Zabbix log does not contain the line cat: /proc//status: No such file or directory
As I understand it, the problem is that the PID of the process is not discovered (note the empty PID in /proc//status), so the trigger fires and the alert action runs.
Is there any way to troubleshoot this further, to see why the Zabbix agent cannot detect the PID of the running process on the affected machines?
The problem is resolved now.
I used zabbix_get to query the Zabbix agent directly. That showed the agent could not see any processes owned by jenkins or by any other non-zabbix user.
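For anyone repeating that check, it might look roughly like the sketch below. The helper names and the example hostname are made up for illustration; zabbix_get and its -s / -k options are the standard ones. It assumes zabbix_get is run from a machine listed in the agent's Server= setting.

```shell
# Hypothetical helper: build a proc.num item key.
# Parameter order is proc.num[<name>,<user>,<state>,<cmdline>].
proc_num_key() {
    printf 'proc.num[%s,%s,%s,%s]' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: ask the agent on host $1 for the process count.
# Remaining arguments are name, user, state, cmdline (empty string = any).
query_proc_num() {
    zabbix_get -s "$1" -k "$(proc_num_key "$2" "$3" "$4" "$5")"
}

# Example calls (jenkins-host.example.com is a placeholder):
#   query_proc_num jenkins-host.example.com "" "" "" /etc/alternatives/java
#   query_proc_num jenkins-host.example.com "" jenkins "" ""
#   query_proc_num jenkins-host.example.com "" zabbix "" ""
```

If the count for the zabbix user is non-zero while the counts for other users are 0 even though ps shows their processes, the agent is being prevented from reading those processes, which matches what I saw.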
Googling brought me to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1032691
Applying a custom SELinux policy resolved the issue.
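A custom policy of this kind is typically generated from the logged denials. The following is only a sketch of that general approach, not the exact policy I applied. It assumes the agent runs in the zabbix_agent_t SELinux domain and that auditd writes to /var/log/audit/audit.log. The function names and the module name zabbix_agent_local are arbitrary.

```shell
# Print AVC denials whose context mentions the Zabbix agent domain.
# Assumption: the agent runs as zabbix_agent_t; pass another log path as $1 if needed.
zabbix_avc_denials() {
    grep 'avc:.*denied' "${1:-/var/log/audit/audit.log}" | grep 'zabbix_agent_t'
}

# Build a local policy module from those denials and load it (run as root).
# audit2allow -M writes zabbix_agent_local.te and zabbix_agent_local.pp
# into the current directory; semodule -i installs the compiled module.
build_zabbix_policy() {
    zabbix_avc_denials "$1" | audit2allow -M zabbix_agent_local &&
        semodule -i zabbix_agent_local.pp
}
```

It is worth reading the generated .te file before loading the module, since audit2allow permits everything that was denied, which may be broader than intended.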