Tags: python, linux, django, ubuntu-12.04, nfs

python os.path.exists() failing for nfs mounted directory file that exists


I have a web server for a site and a second server that only stores files. The file server is connected to the web server by mounting one of its directories over NFS. The website runs Django, so I mostly work in Python. I have run into files being reported as not existing even though they are actually there.

Essentially when I call

import os

filepath = '/path/to/file/on/nfs/share'
exists = os.path.exists(filepath)

exists is False even though the file actually exists; I know it does because I print timestamps to a log file that show exactly when it was created. I'm not sure what the problem could be, but I know the docs for os.path.exists say:

On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.

I know that isn't the case because the files all have the same group, and the group ID is the same on both servers. Could it possibly be a stale cache or something like that?

My mounting is done automatically through fstab.

Client side, the settings are:
filehost:/filefolder /localfolder nfs defaults,rsize=32768,wsize=32768

Server side, the settings are:
/filefolder webserver(rw,sync,no_root_squash,no_subtree_check)

Edit:

Some more specifics: I'm running a Python subprocess that generates a file in the remote directory. When a request is made, the server starts the subprocess and returns the expected location of the file.

On the frontend, a URL is pinged that calls os.path.exists() for that file; once it returns True, the resource is loaded through AJAX.

The suspected problem is that this pinger sometimes reports that the file isn't available for a few seconds after it has actually been created. That's also why I thought it might be a stale cache.

All the files and the directory containing them are owned by user and group www-data, as are any subprocesses started by Django. The problem also isn't completely reproducible: sometimes it works quickly, while other times it takes a few seconds longer than expected.
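A minimal sketch of the flow described above, in case it helps to see it as code. The names start_job and is_ready are hypothetical, and the child process here is just a stand-in for the real generator; in the actual setup the output path would be on the NFS mount.

```python
import os
import subprocess
import sys


def start_job(out_path):
    """Launch a background process that writes out_path and return immediately."""
    # Hypothetical stand-in for the real generator: a child Python process
    # that writes a small file to the path given as its first argument.
    code = "import sys; open(sys.argv[1], 'w').write('done')"
    proc = subprocess.Popen([sys.executable, "-c", code, out_path])
    return proc, out_path


def is_ready(out_path):
    """What the pinged URL does: report whether the file is visible yet."""
    return os.path.exists(out_path)
```

On a local filesystem is_ready() turns True as soon as the child has created the file; over NFS it is this check that sometimes lags by a few seconds.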


Solution

  • This is due to the NFS cache, as found here:

    Attribute cache caches everything in struct stat, so stat() and fstat() calls can be returned from cache. If you need to see a file's latest size or mtime (or other fields), you need to flush the file's attribute cache before the stat() call.

    Note that if file handle is cached, stat() returns information for that cached file (so the result is the same as for fstat()). If you need to stat() the latest file with the given file name, flush the file handle cache first.

    I think it's stat() failing because the file is not in the cache yet. I found this in the NFS man page:

    ac / noac - Selects whether the client may cache file attributes. If neither option is specified (or if ac is specified), the client caches file attributes.

    But there is a warning there too, so I just lived with the delay:

    Using the noac option provides greater cache coherence among NFS clients accessing the same files, but it extracts a significant performance penalty. As such, judicious use of file locking is encouraged instead. The DATA AND METADATA COHERENCE section contains a detailed discussion of these trade-offs.
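If you would rather tolerate the delay in code than change mount options, something along these lines could work. This is only a sketch under assumptions: wait_for_file and its parameters are made-up names, it needs Python 3.3+ for time.monotonic(), and re-listing the parent directory is a commonly suggested nudge to make the client revalidate its cache, not something guaranteed to help.

```python
import os
import time


def wait_for_file(path, timeout=10.0, interval=0.5):
    """Poll until path is visible or timeout seconds pass; return True if seen."""
    deadline = time.monotonic() + timeout
    parent = os.path.dirname(path) or "."
    while True:
        try:
            # Assumption: re-listing the parent directory may prompt the NFS
            # client to refresh its cached view. This is not guaranteed.
            os.listdir(parent)
        except OSError:
            pass  # parent missing or unreadable; fall through to the check
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, wait_for_file('/path/to/file/on/nfs/share', timeout=5) returns False instead of blocking forever if the file never shows up. The nfs man page also documents actimeo= and lookupcache= as finer-grained alternatives to noac, with similar performance trade-offs.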