I have written a script that works like supervisord: it detects whether a process is down and, if it is, starts it. Sometimes I found that the process was running but the script thought it was down.
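    import logging
    import os

    # Assumed: the real script configures its own logger elsewhere.
    logger = logging.getLogger(__name__)

    def check_status(service, port):
        """
        Check the service status.

        Args:
            service: the name of the service.
            port: the port the service listens on.
        """
        # List listening TCP sockets and keep lines that match both service and port.
        cmd = "netstat -lntp | grep %s | grep %s | awk -F '[:]' '{print $2}'" % (service, port)
        logger.info(cmd + "\n")
        results = os.popen(cmd).readlines()
        logger.info(results)
        # Any output at all is treated as "the service is up".
        return bool(results)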
Here is the log:
2017-04-02 07:53:02,006,1491090782.006675,INFO-netstat -lntp | grep uwsgi | grep 8083 | awk -F '[:]' '{print $2}'
2017-04-02 07:53:02,043,1491090782.043374,INFO-[]
2017-04-02 07:53:02,043,1491090782.043619,INFO-2017-04-02 07:53:02 [ERROR] uwsgi:8083 is down.
2017-04-02 07:53:02,043,1491090782.043733,INFO-2017-04-02 07:53:02 [INFO] try to start uwsgi:8083
2017-04-02 07:53:02,043,1491090782.043814,INFO-cmd:sh /usr/local/sandai/webrtc-env/apprtc/sbin/apprtc.sh start 8083
2017-04-02 07:53:03,100,1491090783.100647,INFO-netstat -lntp | grep uwsgi | grep 8083 | awk -F '[:]' '{print $2}'
2017-04-02 07:53:03,138,1491090783.138201,INFO-['8083 0.0.0.0\n']
2017-04-02 07:53:03,138,1491090783.138506,INFO-2017-04-02 07:53:03 [INFO] uwsgi have been started.
But when I ran ps -ef | grep uwsgi | grep 8083, I found that the server was not down:
[ops01@test 2017.04.02]# ps -ef | grep uwsgi | grep 8083
ops01 22684 1 0 2016 ? 00:03:14 uwsgi --plugin http,python,gevent --http :8083
Is netstat not a proper way to detect whether a process is down? If not, why? Thanks.
"Server running" and "server listening on the port" are essentially two different things. Depending on how the server is implemented, it can happen, that process itself is running but it was unable to start listening on the port. Also, there is always some window between starting the server and server actually starting to listen on the port.
I usually use two separate processes for this purpose: