java multithreading jakarta-ee cluster-computing apache-commons-vfs

Java cluster, run task only once


We have a Java process that listens to a directory X on the file system using Apache Commons VFS. Whenever a new file is exported to this directory, our process kicks in. We first rename the file to filename.processing, then parse the file name, extract some information from the file, and insert it into tables before sending the file to a document management system. The application is single-threaded on each cluster node. Now consider this running in a cluster environment: we have 5 servers, so 5 different VMs are trying to get access to the same file. The whole implementation rests on the assumption that only one process can rename the file to .processing at a given time, since the OS will not allow multiple processes to modify the file at the same time. Once a node gets hold of the file and renames it to .processing, the other nodes ignore files with the .processing extension.

This worked fine for more than a year, but we have just found a few duplicates. It looks like multiple nodes got hold of the same file: say nodes a, b, and c all got access to f.pdf and renamed it to f.pdf.processing at the same time (I am still baffled how the OS allows the file to be modified concurrently). As a result, nodes a, b, and c each processed the file and sent it to the document management system, so there are now 3 duplicate files.

In short, I am looking for approaches to run a task only once in a cluster environment. I also want a failover mechanism, so that if something goes wrong on one node, another node picks up the task. We don't want to set an environment variable such as master=true on one box, as that would limit processing to a single node and would not handle failover.

Any kind of help is appreciated.


Solution

  • We are implementing our own synchronization logic using a shared lock table in the application database. This lets every cluster node check whether a job is already running before starting it itself.
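
A minimal sketch of what such a lock table could look like with plain JDBC, not the poster's actual implementation. The table name file_lock, its columns, the class FileLockTable, and the lease-based takeover are all assumptions made for illustration. The idea is that a primary key on the file name lets exactly one node's INSERT succeed, and a conditional UPDATE lets another node take over a lock that has gone stale, which covers the failover requirement. The sketch also assumes the connection is in auto-commit mode and that node clocks are roughly in sync.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;
import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;

/**
 * Hypothetical lock table (names are assumptions, not from the question):
 *
 *   CREATE TABLE file_lock (
 *     file_name  VARCHAR(255) PRIMARY KEY,
 *     owner_node VARCHAR(64)  NOT NULL,
 *     locked_at  TIMESTAMP    NOT NULL
 *   );
 */
public class FileLockTable {
    private final Duration lease; // how long a lock is honoured before others may take over

    public FileLockTable(Duration lease) {
        this.lease = lease;
    }

    /** Returns true if the calling node now owns the file and should process it. */
    public boolean tryClaim(Connection con, String fileName, String nodeId, Instant now)
            throws SQLException {
        return insertLock(con, fileName, nodeId, now)
                || takeOverStaleLock(con, fileName, nodeId, now);
    }

    // The primary key guarantees that only one INSERT per file name can succeed.
    private boolean insertLock(Connection con, String fileName, String nodeId, Instant now)
            throws SQLException {
        String sql = "INSERT INTO file_lock (file_name, owner_node, locked_at) VALUES (?, ?, ?)";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, fileName);
            ps.setString(2, nodeId);
            ps.setTimestamp(3, Timestamp.from(now));
            return ps.executeUpdate() == 1;
        } catch (SQLException e) {
            if (isDuplicateKey(e)) {
                return false; // another node already holds the lock row
            }
            throw e;
        }
    }

    // Failover: claim a lock whose owner has not finished within the lease period.
    private boolean takeOverStaleLock(Connection con, String fileName, String nodeId, Instant now)
            throws SQLException {
        String sql = "UPDATE file_lock SET owner_node = ?, locked_at = ? "
                   + "WHERE file_name = ? AND locked_at < ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, nodeId);
            ps.setTimestamp(2, Timestamp.from(now));
            ps.setString(3, fileName);
            ps.setTimestamp(4, Timestamp.from(now.minus(lease)));
            return ps.executeUpdate() == 1; // 0 rows means the lock is still fresh
        }
    }

    // Drivers differ: some throw the specific subclass, others only set SQLState class 23.
    private static boolean isDuplicateKey(SQLException e) {
        return e instanceof SQLIntegrityConstraintViolationException
                || (e.getSQLState() != null && e.getSQLState().startsWith("23"));
    }
}
```

Each node would call tryClaim before touching a newly detected file and skip the file when it returns false. The lease has to be longer than the slowest expected processing run (or the owner has to refresh locked_at while it works), otherwise a second node could take over a file that is still being processed.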