I am using flock within an HPC application on a file system shared among many machines via NFS. Locking works fine as long as all machines behave as expected (quote from http://en.wikipedia.org/wiki/File_locking: "Kernel 2.6.12 and above implement flock calls on NFS files using POSIX byte-range locks. These locks will be visible to other NFS clients that implement fcntl-style POSIX locks").
I would like to know what is expected to happen if one of the machines that holds a lock shuts down unexpectedly, e.g. due to a power outage. I am not sure where to look this up. My guess is that this is entirely up to NFS and how it handles NFS handles of non-responsive machines. I could imagine that the other clients still see the lock until a timeout occurs and the NFS server declares all NFS handles of the timed-out machine invalid. Is that correct? What would that timeout be? What happens if the machine comes back up within the timeout? Can you recommend a definitive reference to look all of this up?
Thanks!
With NFSv4 (!), the file will be unlocked once the server has not heard from the client for a certain amount of time, the lease period. This lease period defaults to 90s.
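On the client side, this means a waiting machine may have to retry for somewhat longer than the lease period before a lock left behind by a crashed peer goes away. Here is a rough sketch of such a retry loop in Python, not a definitive implementation. The names `acquire_lock`, `release_lock` and `LEASE_SECONDS` are made up for illustration, and the 90-second value is only an assumption that should be checked against your server's configuration.

```python
import errno
import fcntl
import os
import time

# Assumed lease period in seconds; verify this against your NFS server.
LEASE_SECONDS = 90


def acquire_lock(path, timeout=LEASE_SECONDS + 30, poll_interval=1.0):
    """Retry a non-blocking exclusive flock on `path` until `timeout` expires.

    Returns an open file descriptor that holds the lock.
    Raises TimeoutError if the lock could not be taken in time.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except OSError as exc:
            # EAGAIN/EACCES mean "held by someone else"; anything else is fatal.
            if exc.errno not in (errno.EAGAIN, errno.EACCES):
                os.close(fd)
                raise
        if time.monotonic() >= deadline:
            os.close(fd)
            raise TimeoutError(f"could not lock {path} within {timeout} s")
        time.sleep(poll_interval)


def release_lock(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

Typical use would be `fd = acquire_lock("/shared/job.lock")`, then the critical section, then `release_lock(fd)`, where the path is a placeholder. Choosing a timeout a bit above the lease period is a judgment call: a shorter timeout can give up while the server still considers the dead client's lock valid.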