I'm integrating watchman via the socket/bser interface in a JVM program.
I'm seeing odd timing where:
However, somehow, Thread B is reading an empty file.
Which, I assume is validly empty at some point, e.g. the IO/syscalls might be:
And I assume my Thread B is reading the file between steps 1 and 2. Or maybe 1 and 4, if 4 is when the result is flushed.
My confusion is two fold:
1) I thought watchman's default 20ms wait would account for things like this, and I'd only see an update on my thread A, let alone when my thread B does a read, after step 4, and the data is done being written to the file.
2) Even if watchman did tell me "too soon" about the 1st syscall (say step 1), and I read the results while it was an empty file, there should be another syscall/watchman notification that "btw, the file has some content now".
FWIW/oddly enough, I was seeing this very same behavior when using the Java WatchService API, where I would get an inotify event, but read a file "too soon", and so get either empty or partial results, and then no follow up inotify event when the rest of the data was available.
I assumed this was a fluke/nuance of the WatchService, so I solved it at the time by checking the file mod time before reading it, and just waiting to ensure mod time >2 seconds old before assuming the file is "done" being written.
(Note that this also handled ~100mb+ files being written, where the build process might write a chunk of data every 100ms+, but with WatchService I was seeing 100s of inotify notifications for what was essentially a single continuous write.)
When I ported my WatchService code to watchman, I dropped this "ensureSettled" hack, because I assumed watchman's 20ms settle period (which is way lower than the 2s I was using, but hey it's the default) + it's general robustness compared to the somewhat beta WatchService would mean it wouldn't be a problem.
But within ~a day of using the watchman-ported code, I'm seeing empty file reads, just like I was with the WatchService.
Any ideas about what I'm missing?
I can add back the ensureSettled hack, but at this point I'm curious about what is going on.
The docs aren't very clear on this, sorry!
Dispatching of subscription notifications is subject to the settle timeout, but since file updates are non-atomic it's likely that the default 20ms kicks in before the file contents are visible to you; under the covers, the kernel generates a series of notifications for the various mutations that you're doing, so if the truncate takes 20ms before you write (or perhaps flush) the data out, you'll likely get a notification "in the middle".
This stuff is also operating system dependent. Here's an example of a recently discovered and resolved issue: https://github.com/facebook/watchman/commit/bac383c751b248ae742a2a20df3e8272238c0ae2 it doesn't sound like it is quite the same thing as you're experiencing, it just adds some color to this discussion.
If you already have code to manage the settling in your client, then it may be easier for you to add that back; we do this in watchman-make
for example.
You may also wish to try setting https://facebook.github.io/watchman/docs/config.html#settle in a .watchmanconfig
file in the root of the directory tree that you're watching and leave that to the watchman server. If/when you change this setting, you will need to delete and restart the watch.
Which you choose depends on how you want to trade ease of configuration against volume of code you want to maintain and (perhaps) volume of support questions from your user base if the .watchmanconfig
isn't correctly configured for them.
Note that you can use the command invocation from https://facebook.github.io/watchman/docs/cmd/log-level.html to see the debug logging for the kernel notifications as they come in in real time; this may be helpful for you in understanding exactly which notifications are coming in and when.
Just curious, are you using https://github.com/facebook/watchman/tree/master/java to talk to the watchman server?