I was digging into the source of Flume NG's SpoolingDirectorySource and found that it polls the spool directory at intervals set by the POLL_DELAY_MS parameter to generate new events. These events are then handled by ReliableSpoolingFileEventReader in a separate thread.
I was wondering why ReliableSpoolingFileEventReader does not use the WatchService API, which is fairly low level as well as thread-safe. Was there a specific design constraint that favored polling over a watcher?
Thanks.
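In general, Flume works better with batches of events. This is because the File channel calls fsync once per batch, so the cost of each sync is spread over every event in that batch. Waiting for a period of time between polls is therefore a good trade-off: it lets a batch of events accumulate before they are committed.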
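
To make the trade-off concrete, here is a minimal sketch of poll-then-batch in plain Java. It is not Flume's actual code: the class SpoolPoller, its methods, and the ".DONE" suffix are hypothetical names made up for illustration, and the "commit" step only renames files where a real channel would do its single durable write.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Flume.
class SpoolPoller {
    // Suffix marking files that were already consumed (assumed name).
    static final String DONE_SUFFIX = ".DONE";

    /** One polling pass: collect up to batchSize unprocessed regular files. */
    static List<Path> pollOnce(Path spoolDir, int batchSize) throws IOException {
        List<Path> batch = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(spoolDir)) {
            for (Path p : stream) {
                if (batch.size() >= batchSize) {
                    break;
                }
                boolean done = p.getFileName().toString().endsWith(DONE_SUFFIX);
                if (Files.isRegularFile(p) && !done) {
                    batch.add(p);
                }
            }
        }
        return batch;
    }

    /** Commit the whole batch in one step, then mark each file as done. */
    static int commitBatch(List<Path> batch) throws IOException {
        // A real channel would perform one durable write (one fsync) here
        // for the entire batch instead of one per file.
        for (Path p : batch) {
            Files.move(p, p.resolveSibling(p.getFileName() + DONE_SUFFIX));
        }
        return batch.size();
    }

    /** Poll a fixed number of times, sleeping pollDelayMs between passes. */
    static int drain(Path spoolDir, int batchSize, long pollDelayMs, int maxPolls)
            throws IOException, InterruptedException {
        int total = 0;
        for (int i = 0; i < maxPolls; i++) {
            total += commitBatch(pollOnce(spoolDir, batchSize));
            Thread.sleep(pollDelayMs); // lets new files accumulate into a batch
        }
        return total;
    }
}
```

With a watcher, each file-creation event would tend to trigger its own small commit unless you add buffering on top, at which point you are effectively back to a timed batch.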