Tags: c, linux, asynchronous, synchronization, glibc

Linux synchronization without polling


In principle what I want is very simple.

Two executables ./read and ./write respectively read and write from a resource (let's say a file). Using flock(2) it is easy to prevent race conditions between arbitrary invocations of ./read and ./write at arbitrary times.

The requirement is that each invocation of ./read contains a snapshot of the resource from a previous invocation, and if the current resource matches the snapshot, ./read should wait (sleep) until an invocation of ./write changes the resource.

From what I gather, the program flows of each program should be something like:

//read.c
obtain mutex0
  read resource
  is resource same as our snapshot?
    release mutex0 [1]
    sleep until ./write says to wake up [2]
    obtain mutex0
    read resource
  do something with resource
release mutex0

//write.c
obtain mutex0
  change resource in some way
  tell any sleeping ./read's to wake up
release mutex0

The main problem with this approach is that there is a window between the lines labelled [1] and [2]. A ./read can release mutex0 at [1], an entire invocation of ./write can then run to completion, and only afterwards does [2] execute. The reader then stalls indefinitely, because ./write has already sent its wake-up to whichever readers were sleeping at that time.

Is there no easy way to do what I want, besides using an entire separate full-blown server process? Also, for those curious I want to do this for an application in CGI.


Solution

  • No, the program flow for the reader is incorrect. You need some sort of locking mechanism to prevent writes while one or more reads are in progress, and some sort of wakeup mechanism to notify readers whenever a write is completed.

    Your program flow for the writer(s) is okay:

        # Initial read of file contents
        Obtain lock
            Read file
        Release lock
    
        # Whenever the writer wishes to modify the file:
        Obtain lock
            Modify file
            Signal readers
        Release lock
    

    The program flow for the reader(s) should be:

        # Initial read of file contents
        Obtain lock
            Read file
        Release lock
    
        # Wait and respond to changes in file
        On signal:
            Obtain lock
                Read file
            Release lock    
            Do something with modified file contents
    

    If there is only one reader, then a mutex (pthread_mutex_t) in shared memory (accessible to all writers and the reader) suffices; otherwise, I recommend using an rwlock (pthread_rwlock_t) instead. For waking up any waiting readers, broadcast on a condition variable (pthread_cond_t). The difficulty, of course, is setting up that shared memory.
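
As a rough illustration of the shared-memory variant, the sketch below keeps a process-shared mutex, a condition variable and a generation counter in a POSIX shared memory object. The struct layout, the function names and the generation counter are assumptions made for this example, not part of the answer above. Because pthread_cond_wait() releases the mutex and starts waiting in one atomic step, the gap between [1] and [2] in the question disappears. Note that pthread_cond_wait() works with a mutex, not an rwlock. The sketch also ignores the case where a second process maps the segment before the creator has finished initialising it.

```c
#define _GNU_SOURCE   /* for ftruncate() under strict -std= modes */
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical layout of the shared segment (not from the original answer). */
struct shared_state {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    uint64_t        generation;  /* incremented by every writer */
};

/* Map the segment, creating and initialising it if it does not exist yet.
 * Returns NULL on failure. */
struct shared_state *shared_open(const char *name)
{
    int created = 1;
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd == -1) {
        created = 0;
        fd = shm_open(name, O_RDWR, 0600);
        if (fd == -1)
            return NULL;
    }
    /* A fresh object has size 0; ftruncate() grows it and zero-fills it. */
    if (created && ftruncate(fd, sizeof(struct shared_state)) == -1) {
        close(fd);
        shm_unlink(name);
        return NULL;
    }
    struct shared_state *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
    close(fd);
    if (s == MAP_FAILED)
        return NULL;
    if (created) {
        pthread_mutexattr_t ma;
        pthread_condattr_t ca;
        pthread_mutexattr_init(&ma);
        pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&s->lock, &ma);
        pthread_mutexattr_destroy(&ma);
        pthread_condattr_init(&ca);
        pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
        pthread_cond_init(&s->changed, &ca);
        pthread_condattr_destroy(&ca);
    }
    return s;
}

/* Writer side: change the resource, bump the generation, wake all readers. */
void shared_publish(struct shared_state *s)
{
    pthread_mutex_lock(&s->lock);
    /* ... modify the resource here ... */
    s->generation++;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Reader side: sleep until the generation differs from the one already seen.
 * Returns the new generation. */
uint64_t shared_wait_newer(struct shared_state *s, uint64_t seen)
{
    pthread_mutex_lock(&s->lock);
    while (s->generation == seen)
        pthread_cond_wait(&s->changed, &s->lock);  /* unlocks and waits atomically */
    uint64_t now = s->generation;
    /* ... read the resource here ... */
    pthread_mutex_unlock(&s->lock);
    return now;
}
```

A CGI reader would pass the generation it remembered from its previous invocation as `seen`. If a writer has run in the meantime, the call returns at once instead of sleeping. On older glibc versions this needs `-pthread -lrt` at link time.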


    Advisory file locking and the fanotify interface are also sufficient. Readers install a fanotify FAN_MODIFY mark, and simply wait for the corresponding event. Writers do not need to co-operate, except for the use of an advisory lock (which exists only to stop readers from reading while the file is modified).

    Unfortunately, the interface currently requires the CAP_SYS_ADMIN capability, which you definitely do not want random CGI programs to have.


    Advisory file locking and the inotify interface are sufficient, and I believe the most appropriate for this, when both readers and writers open and close the file for each set of operations. The program flow for this case for the reader(s) is:

    Initialize inotify interface
    Add inotify watch for IN_CREATE and IN_CLOSE_WRITE for "file"
    
    Open "file" read-only
        Obtain shared/read-lock
            Read contents
        Release lock
    Close "file"
    
    Loop:
        Read events from inotify descriptor.
        If IN_CREATE or IN_CLOSE_WRITE for "file":
            Open "file" read-only
                Obtain shared/read-lock
                    Read contents
                Release lock
            Close "file"
            Do something with file contents
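
The reader flow above could look roughly like the following in C. This is a minimal sketch under a few assumptions: the helper names are made up for the example, and the watch is placed on the file itself for IN_CLOSE_WRITE only. Catching a file that is replaced by rename would need a watch on the containing directory instead. The important detail is the order: the watch is installed before the first read, so a write that lands in between is still queued as an event rather than lost.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Create an inotify descriptor watching path for IN_CLOSE_WRITE.
 * Returns the descriptor, or -1 on failure. */
int watch_open(const char *path)
{
    int ifd = inotify_init1(IN_CLOEXEC);
    if (ifd == -1)
        return -1;
    if (inotify_add_watch(ifd, path, IN_CLOSE_WRITE) == -1) {
        close(ifd);
        return -1;
    }
    return ifd;
}

/* Block until a writer has closed the file. Returns 0 on such an event,
 * -1 on error or if the watch went away (for example, the file was deleted).
 * Events left in the same read are dropped, which is fine here because the
 * caller re-reads the whole file anyway. */
int watch_wait(int ifd)
{
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    for (;;) {
        ssize_t len = read(ifd, buf, sizeof buf);
        if (len == -1 && errno == EINTR)
            continue;
        if (len <= 0)
            return -1;
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            if (ev->mask & IN_CLOSE_WRITE)
                return 0;
            if (ev->mask & IN_IGNORED)
                return -1;
            p += sizeof *ev + ev->len;
        }
    }
}

/* Open path, read up to cap bytes under a shared advisory lock, close it.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_locked(const char *path, char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_SH) == -1) {
        close(fd);
        return -1;
    }
    ssize_t total = 0, n = 0;
    while ((size_t)total < cap &&
           (n = read(fd, buf + total, cap - (size_t)total)) > 0)
        total += n;
    flock(fd, LOCK_UN);
    close(fd);
    return n < 0 ? -1 : total;
}
```

A reader would call `watch_open()` once, then `read_locked()` and compare the result with its snapshot, and only call `watch_wait()` if nothing has changed.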
    

    The writer is still just:

        # Initial read of file contents
        Open "file" for read-only
            Obtain shared/read-lock on "file"
                Read contents
            Release lock
        Close "file"
    
        # Whenever the writer wishes to modify the file:
        Open "file" for read-write
            Obtain exclusive/write-lock
                Modify file
            Release lock
        Close "file"
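
The modifying half of the writer might be sketched like this. The function name `write_locked` is hypothetical, and replacing the whole contents is just one possible meaning of "modify file". The file is truncated only after the exclusive lock is held, so a lock-respecting reader never sees it half-written, and closing the descriptor is what triggers IN_CLOSE_WRITE for the readers.

```c
#define _GNU_SOURCE   /* for ftruncate() under strict -std= modes */
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Replace the contents of path with data[0..len) under an exclusive
 * advisory lock. Returns 0 on success, -1 on failure. */
int write_locked(const char *path, const char *data, size_t len)
{
    /* No O_TRUNC here: truncating before the lock is held would race
     * with readers that hold the shared lock. */
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX) == -1) {
        close(fd);
        return -1;
    }

    int rc = 0;
    if (ftruncate(fd, 0) == -1)
        rc = -1;

    size_t off = 0;
    while (rc == 0 && off < len) {
        ssize_t n = write(fd, data + off, len - off);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            rc = -1;
        } else {
            off += (size_t)n;   /* short write: continue with the rest */
        }
    }

    flock(fd, LOCK_UN);
    /* close() is what makes the kernel emit IN_CLOSE_WRITE. */
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```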
    

    Even if the writers do not obtain the lock, the readers will be notified when a writer closes the file; the only risk is that another set of changes is written (by another lock-spurning modifier) while the readers are reading the file.

    Even if a modifier replaces the file with a new one, the readers get correctly notified when a new one is ready (either renamed/linked on top of the old one, or the new file creator closes the file). It is important to note that if the readers keep the file open, their file descriptors will not magically jump to the new file, and they will only see the old (probably deleted) contents.


    If it is for some reason important that readers and writers do not close the file, the readers can still use inotify, but with an IN_MODIFY watch instead, to be notified whenever the file is truncated or written to. In this case, it is important to remember that if the file is then replaced (renamed over, or deleted and recreated), the readers and writers will not see the new file, but will keep operating on the old contents, which are no longer visible in the filesystem.

    The program flow for the reader:

    Initialize inotify interface
    Add inotify watch for IN_MODIFY for "file"
    
    Open "file" read-only
        Obtain shared/read-lock
            Read contents
        Release lock
    
        Loop:
            Read events from inotify descriptor.
            If IN_MODIFY for "file":
                Obtain shared/read-lock on "file"
                    Read contents
                Release lock
                Do something with file contents
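
For the keep-the-file-open variant, a sketch along the following lines may help. The struct and function names are assumptions for this example. It also uses poll() with a timeout, which the answer does not mention, so that a CGI reader can give up after a bounded wait. The inotify descriptor is non-blocking so the event queue can be drained: several writes then collapse into one re-read.

```c
#define _GNU_SOURCE   /* for pread() under strict -std= modes */
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/file.h>
#include <sys/inotify.h>
#include <unistd.h>

struct open_reader {
    int fd;   /* the data file, kept open */
    int ifd;  /* inotify descriptor watching it for IN_MODIFY */
};

/* Install the watch first, then open the file. Returns 0 or -1. */
int reader_open(struct open_reader *r, const char *path)
{
    r->ifd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
    if (r->ifd == -1)
        return -1;
    if (inotify_add_watch(r->ifd, path, IN_MODIFY) == -1 ||
        (r->fd = open(path, O_RDONLY)) == -1) {
        close(r->ifd);
        return -1;
    }
    return 0;
}

/* Wait up to timeout_ms (-1 = forever) for a modification.
 * Returns 1 if the file was modified, 0 on timeout, -1 on error. */
int reader_wait(struct open_reader *r, int timeout_ms)
{
    struct pollfd pfd = { .fd = r->ifd, .events = POLLIN };
    int rc;
    do {
        rc = poll(&pfd, 1, timeout_ms);   /* timeout restarts after EINTR */
    } while (rc == -1 && errno == EINTR);
    if (rc <= 0)
        return rc;

    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    int modified = 0;
    for (;;) {
        ssize_t len = read(r->ifd, buf, sizeof buf);
        if (len == -1 && errno == EINTR)
            continue;
        if (len == -1 && errno != EAGAIN)
            return -1;
        if (len <= 0)
            break;                        /* EAGAIN: queue is drained */
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            if (ev->mask & IN_MODIFY)
                modified = 1;
            if (ev->mask & IN_IGNORED)
                return -1;                /* watch removed, file is gone */
            p += sizeof *ev + ev->len;
        }
    }
    return modified;
}

/* Re-read from offset 0 of the still-open file under a shared lock.
 * A single pread() is used, which is adequate for small files. */
ssize_t reader_reread(struct open_reader *r, char *buf, size_t cap)
{
    if (flock(r->fd, LOCK_SH) == -1)
        return -1;
    ssize_t n = pread(r->fd, buf, cap, 0);
    flock(r->fd, LOCK_UN);
    return n;
}
```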
    

    The program flow for the writer is still almost the same:

        # Initial read of file contents
        Open "file" for read-only
            Obtain shared/read-lock on "file"
                Read contents
            Release lock
        Close "file"
    
        Open "file" for read-write
    
        # Whenever writer wishes to modify the file:
        Obtain exclusive/write-lock
            Modify file
        Release lock
    

    It may be important to note that the inotify events occur after the fact. There is usually some small latency, which might depend on the load on the machine. So, if prompt response to file changes is important for the system to work correctly, you may have to go with the shared-memory approach (a mutex or rwlock plus a condition variable) instead.

    In my experience, these latencies tend to be shorter than the typical human reaction interval. Therefore, I consider -- and I suggest you do so too -- the inotify interface fast and reliable enough at human timescales; not so at millisecond and sub-millisecond machine timescales.