What does the dir_fd argument of os.fwalk() do?


If I assign an integer to the dir_fd argument of os.fwalk(), a fourth value is added to each of the tuples generated by list(os.fwalk()).

I understand that these fourth values have something to do with the hierarchy in which the files and directories are organised, but I don't quite get their exact meaning.

Also, the values change, depending on the integer assigned to dir_fd, and there is always one number missing (in this case 82, see below).

Any ideas?

Code:

import os

os.chdir("/home/test")
inp = os.path.join(os.getcwd(), "input")

l = list(os.fwalk(inp, dir_fd=3))

Output:

[('/home/test/input', ['a', 'b', 'c'], ['d.txt'], 80),
 ('/home/test/input/a', ['aa'], ['ac.txt', 'ab.txt'], 81),
 ('/home/test/input/a/aa', [], [], 83),
 ('/home/test/input/b', [], ['bb.txt', 'bc.txt', 'ba.txt'], 81),
 ('/home/test/input/c', ['ca'], [], 81),
 ('/home/test/input/c/ca', ['caa'], ['cab.txt'], 83),
 ('/home/test/input/c/ca/caa', [], ['caaa.txt'], 84)]

Solution

  • The documentation for dir_fd is confusingly located. Here's what it says:

    paths relative to directory descriptors: If dir_fd is not None, it should be a file descriptor referring to a directory, and the path to operate on should be relative; path will then be relative to that directory. If the path is absolute, dir_fd is ignored. (For POSIX systems, Python will call the variant of the function with an at suffix and possibly prefixed with f (e.g. call faccessat instead of access).

    You can check whether or not dir_fd is supported for a particular function on your platform using os.supports_dir_fd. If it’s unavailable, using it will raise a NotImplementedError.

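    So if you pass a dir_fd, fwalk interprets the path argument as relative to the directory that the file descriptor refers to. In your example the path is absolute, so dir_fd is ignored. The fourth value in each tuple is not caused by dir_fd: fwalk always yields 4-tuples (dirpath, dirnames, filenames, dirfd), where dirfd is a file descriptor referring to the directory dirpath.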

    (It sounds like you also don't know what a file descriptor even is. A file descriptor is an integer identifying an open file or directory. You can get one with os.open or a few other ways. The advantage of using a file descriptor instead of a path here is that a file descriptor remains valid even if stuff gets moved around or renamed, invalidating old paths.)
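
To make the explanation above concrete, here is a minimal sketch of dir_fd used with a relative path. The helper names walk_sizes and demo, and the temporary directory layout, are made up for illustration and are not part of the question. It assumes a POSIX system where os.fwalk and dir_fd are supported (see os.supports_dir_fd).

```python
import os
import tempfile


def walk_sizes(base_fd, rel_path):
    """Walk rel_path relative to the directory open as base_fd.

    Returns {dirpath: {filename: size_in_bytes}}.
    """
    found = {}
    # fwalk always yields 4-tuples; the last item is a descriptor for dirpath.
    for dirpath, dirnames, filenames, dirfd in os.fwalk(rel_path, dir_fd=base_fd):
        # dirfd is only valid during this iteration, so use it right away.
        found[dirpath] = {
            name: os.stat(name, dir_fd=dirfd).st_size for name in filenames
        }
    return found


def demo():
    # Hypothetical layout: <tmp>/test/input/d.txt and <tmp>/test/input/a/ab.txt
    with tempfile.TemporaryDirectory() as tmp:
        old_base = os.path.join(tmp, "test")
        os.makedirs(os.path.join(old_base, "input", "a"))
        with open(os.path.join(old_base, "input", "d.txt"), "w") as f:
            f.write("hello")
        with open(os.path.join(old_base, "input", "a", "ab.txt"), "w") as f:
            f.write("abc")

        # A file descriptor for the base directory, obtained with os.open.
        base_fd = os.open(old_base, os.O_RDONLY)
        try:
            before = walk_sizes(base_fd, "input")
            # Rename the base directory: the old path is gone, the descriptor is not.
            os.rename(old_base, os.path.join(tmp, "renamed"))
            after = walk_sizes(base_fd, "input")
        finally:
            os.close(base_fd)
        return before, after


if __name__ == "__main__":
    before, after = demo()
    print(before)
    print(after)
```

Both walks should report the same relative paths ("input" and "input/a"), even though the directory behind base_fd was renamed in between. The dirfd yielded by fwalk is used the same way, as the dir_fd argument of os.stat, so each file is looked up relative to the directory currently being visited.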