
How can I lock a file while writing to it asynchronously?


I've got two Node threads running: one watches a directory for files to consume, and the other is responsible for writing files to given directories.

Typically they won't be operating on the same directory, but for an edge case I'm working on they will be.

It appears that the consuming app is grabbing the files before they are fully written, resulting in corrupt files.

Is there a way I can lock the file until the writing is complete? I've looked into the lockfile module, but unfortunately I don't believe it will work for this particular application.

=====

The full code is far more than makes sense to put here, but the gist of it is this:

  1. App spins off the watchers and listener

Listener:

  • listen for file being added to db, export it using fs.writeFile

Watcher:

  • watcher uses chokidar to track added files in each watched directory
  • when a file is found, fs.access is called to ensure we have access to it
    • fs.access seems to be unfazed by the file still being written
  • the file is consumed via fs.createReadStream and then sent to server
    • filestream is necessary as we need the file hash
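
To illustrate the fs.access observation above, here is a minimal sketch. fs.access only checks permissions, so it succeeds on a file that is still half written. The scratch file name and the 1024-byte chunk size are assumptions made up for the demonstration, and the synchronous calls are used only to keep the ordering obvious.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, used only for this demonstration.
const target = path.join(os.tmpdir(), `partial-${process.pid}.bin`);

// Open the file and write only the first half of the payload.
const fd = fs.openSync(target, 'w');
fs.writeSync(fd, Buffer.alloc(1024, 1));

// The access check passes even though the write is not finished,
// because it tests permissions, not completeness.
let accessibleMidWrite = true;
try {
  fs.accessSync(target, fs.constants.R_OK);
} catch (err) {
  accessibleMidWrite = false;
}
const sizeMidWrite = fs.statSync(target).size;

// Finish the write, then record the final size and clean up.
fs.writeSync(fd, Buffer.alloc(1024, 2));
fs.closeSync(fd);
const sizeFinal = fs.statSync(target).size;
fs.unlinkSync(target);

console.log({ accessibleMidWrite, sizeMidWrite, sizeFinal });
```

A reader that starts streaming at the moment of the access check would see only the first half of the file, which matches the corruption described above.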

In this case the file is exported to the watched directory and then reimported by the watch process.


Solution

  • Writing a lock-state system is actually pretty simple. I can't find where I did this, but the idea is to:

    1. create lock files whenever you acquire a lock,
    2. delete them when releasing a lock,
    3. delete them after a timeout has occurred,
    4. throw whenever requesting a lock for a file whose lock file already exists.

    A lock file is simply an empty file in a single directory. Each lock file gets its name from the hash of the full path of the file it represents. I used MD5, which is more than fast enough for short path strings, but any hashing algo should be fine as long as you are confident there will be no collisions for path strings.
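
The naming scheme could be sketched roughly as follows. The lock directory location, the ".lock" suffix, and the function name lockPathFor are assumptions for illustration, not part of the original setup.

```javascript
const crypto = require('crypto');
const os = require('os');
const path = require('path');

// Hypothetical directory that holds every lock file.
const LOCK_DIR = path.join(os.tmpdir(), 'app-locks');

// Map a target file to its lock file: the MD5 hex digest of the
// resolved (absolute) path, so relative and absolute spellings of
// the same file end up with the same lock.
function lockPathFor(targetFile) {
  const fullPath = path.resolve(targetFile);
  const digest = crypto.createHash('md5').update(fullPath).digest('hex');
  return path.join(LOCK_DIR, `${digest}.lock`);
}
```

Resolving the path first matters: without it, "./out/a.txt" and its absolute form would hash to different lock names even though they refer to the same file.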

    A separate exists-check followed by a create isn't 100% thread-safe, since another process can create the lock file between the two calls. Node can do both in one atomic step, though: opening the lock file with the 'wx' flag (for example via fs.open or fs.writeFile) creates it and fails with EEXIST if it already exists. In my use case, I was holding locks for 10 seconds or more, so microsecond race conditions didn't seem that much of a threat anyway. If you are holding and releasing thousands of locks per second on the same files, use the exclusive flag rather than check-then-create.

    These will be advisory locks only, clearly, so it is up to you to ensure your code requests locks and catches the expected exceptions.
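
Putting the four steps together, a minimal sketch might look like the following. It is not the code I originally wrote. The names acquireLock, releaseLock and writeLocked, the lock directory, and the 30-second timeout are all assumptions. It relies on Node's 'wx' open flag, which creates the file and fails with EEXIST if it already exists. Synchronous calls are used for brevity; fs.promises.open accepts the same flag.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical settings; adjust both to the application.
const LOCK_DIR = path.join(os.tmpdir(), 'app-locks');
const LOCK_TIMEOUT_MS = 30000;

// Lock file name: MD5 hex digest of the target's full path.
function lockPathFor(targetFile) {
  const digest = crypto.createHash('md5').update(path.resolve(targetFile)).digest('hex');
  return path.join(LOCK_DIR, `${digest}.lock`);
}

// Acquire the lock for targetFile, or throw if it is already held.
function acquireLock(targetFile) {
  fs.mkdirSync(LOCK_DIR, { recursive: true });
  const lockPath = lockPathFor(targetFile);

  // Step 3: remove a lock that has outlived the timeout.
  try {
    const ageMs = Date.now() - fs.statSync(lockPath).mtimeMs;
    if (ageMs > LOCK_TIMEOUT_MS) fs.unlinkSync(lockPath);
  } catch (err) {
    if (err.code !== 'ENOENT') throw err; // no lock file is the normal case
  }

  // Steps 1 and 4: 'wx' creates the lock file, or throws EEXIST if it exists.
  try {
    fs.closeSync(fs.openSync(lockPath, 'wx'));
  } catch (err) {
    if (err.code === 'EEXIST') throw new Error(`Already locked: ${targetFile}`);
    throw err;
  }
}

// Step 2: release the lock; a missing lock file is not an error.
function releaseLock(targetFile) {
  try {
    fs.unlinkSync(lockPathFor(targetFile));
  } catch (err) {
    if (err.code !== 'ENOENT') throw err;
  }
}

// Writer side: hold the lock for the whole duration of the write.
function writeLocked(targetFile, data) {
  acquireLock(targetFile);
  try {
    fs.writeFileSync(targetFile, data);
  } finally {
    releaseLock(targetFile);
  }
}
```

The watcher would call acquireLock on a newly detected file before opening its read stream, and retry later if the call throws. Note that the stale-lock cleanup is still a check-then-delete and therefore not race-free; only the creation itself is atomic.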