I've written a simple Python script that monitors the number of file descriptors on a Red Hat system.
When I compare its output to the lsof command, I get two different results.
Broken down to its core, the script does this:
import psutil
p = psutil.Process(PID)
print(p.num_fds())
Currently num_fds() reports 60 open file descriptors, while for the same PID the result of lsof -p PID | wc -l yields 167.
Where is this discrepancy coming from?
My understanding was that num_fds() and lsof both report the same file descriptors, including open file handles, sockets, pipes, etc.
Some background: certain processes seem to open sockets and/or file handles without ever closing them. After a longer period of time such a process reaches its limit of file descriptors and crashes. This tool is intended to monitor whether the process's number of file descriptors is constantly rising.
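The monitoring idea described above can be sketched without psutil at all, by reading /proc directly. This is a minimal Linux-only sketch (the function names are my own; reading another process's /proc/PID/fd requires the same user or root):

```python
import os
import time

def fd_count(pid):
    # Number of open descriptors for PID, read from /proc (Linux-only).
    return len(os.listdir("/proc/%d/fd" % pid))

def monitor(pid, interval=60, samples=5):
    # Take a few samples; a steadily rising series suggests an fd leak.
    counts = []
    for _ in range(samples):
        counts.append(fd_count(pid))
        time.sleep(interval)
    return counts
```

In a real tool you would log each sample with a timestamp and alert when the count approaches the process's RLIMIT_NOFILE.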
Looking at the psutil source on GitHub, the implementation of this method is:
def num_fds(self):
    return len(os.listdir("%s/%s/fd" % (self._procfs_path, self.pid)))
So it's just counting the entries in /proc/PID/fd (or an equivalent procfs location).
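You can inspect those entries yourself: each one is a symlink named after the descriptor number, pointing at the underlying file, socket, or pipe. A small Linux-only sketch using the current process:

```python
import os

# Each entry in /proc/PID/fd is a symlink to the underlying file,
# socket, or pipe; /proc/self/fd refers to the current process.
fd_dir = "/proc/self/fd"
for name in sorted(os.listdir(fd_dir), key=int):
    try:
        target = os.readlink(os.path.join(fd_dir, name))
    except OSError:
        # The descriptor may have been closed between listdir and readlink.
        continue
    print(name, "->", target)
```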
Looking at a current process on my Red Hat server, its /proc/PID/fd directory contains 5 file descriptors, but lsof reports 19 open files. Looking at the full lsof output, the difference is that it lists things that are not associated with numeric file descriptors. The lsof man page describes the FD column:
FD is the File Descriptor number of the file or:
cwd current working directory;
Lnn library references (AIX);
err FD information error (see NAME column);
jld jail directory (FreeBSD);
ltx shared library text (code and data);
Mxx hex memory-mapped type number xx.
m86 DOS Merge mapped file;
mem memory-mapped file;
mmap memory-mapped device;
pd parent directory;
rtd root directory;
tr kernel trace file (OpenBSD);
txt program text (code and data);
v86 VP/ix mapped file;
So, the discrepancy is because lsof includes a variety of "open files" that are not actually mapped to file descriptors.
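If you want the two counts to agree, you can either restrict lsof to numeric descriptors (its -d option takes an FD set, e.g. lsof -a -p PID -d 0-999, if your lsof supports FD ranges) or filter the output yourself. A minimal sketch that keeps only rows whose FD column starts with a digit; the sample lines below are illustrative, not real output:

```python
def count_numeric_fds(lsof_lines):
    # A row describes a real descriptor when its FD column (4th field)
    # starts with a digit, e.g. "0u" or "3r"; cwd, rtd, txt, mem do not.
    count = 0
    for line in lsof_lines:
        fields = line.split()
        if len(fields) > 3 and fields[3][0].isdigit():
            count += 1
    return count

# Illustrative lsof-style output, not captured from a real process.
sample = [
    "COMMAND PID USER  FD   TYPE DEVICE SIZE/OFF NODE NAME",
    "python  123 alice cwd  DIR  253,0     4096    2 /home/alice",
    "python  123 alice txt  REG  253,0    12345   99 /usr/bin/python3",
    "python  123 alice mem  REG  253,0    67890  100 /usr/lib/libc.so",
    "python  123 alice 0u   CHR  136,0      0t0    3 /dev/pts/0",
    "python  123 alice 1u   CHR  136,0      0t0    3 /dev/pts/0",
    "python  123 alice 2u   CHR  136,0      0t0    3 /dev/pts/0",
]
print(count_numeric_fds(sample))  # -> 3
```

With a filter like this, the count derived from lsof should match num_fds(), since both then count only real descriptors.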