Why doesn't OS X lock files like windows does when copying to a Samba share?

I have a project that uses the .net FileSystemWatcher to watch a Samba network share for video files. When it sees a file, it adds it to an encode queue. When files are dequeued, they are moved to a local directory where the process then encodes the file to several different formats and spits them out to an output directory.

The problem arises because the video files are so big, that it often takes several minutes for them to copy completely into the network directory, so when a file is dequeued, it may or may not have completely finished being copied to the network share. When the file is being copied from a windows machine, I am able to work around it because trying to move a file that is still being copied throws an IOException. I simply catch the exception and retry every few seconds until it is done copying.

When a file is dropped into the Samba share from a computer running OS X however, that IOException is not thrown. Instead, a partial file is copied to the working directory which then fails to encode because it is not a valid video file.

So my question is, is there any way to make the FileSystemWatcher wait for files to be completely written before firing its "Created" event (based on this question I think the answer to that question is "no")? Alternatively, is there a way to get files copied from OS X to behave similarly to those in windows? Or do I need to find another solution for watching the Samba share? Thanks for any help.

Solution

Option 3. Your best bet is to have a process that watches the incoming share for files. When it sees a file, note its size and/or modification date.

Then, after some amount of time (like, 1 or 2 seconds), look again. Note any files that were seen before and compare their new sizes/mod dates to the one you saw last time.

Any file that has not changed for some "sufficiently long" period of time (1s? 5s?) is considered "done".

Once you have a "done" file, MOVE/rename that file to another directory. It is from THIS directory that your loading process can run. It "knows" that only files that are complete are in this directory.

By having this two stage process, you are able to later possibly add other rules for acceptance of a file, since all of those rules must pass before the file gets moved to its proper staging area (you can check format, check size, etc.) beyond a simple rule of just file existence.

Your later process can rely on file existence, both as a start mechanism and a restart mechanism. When the process restarts after failure or shut down, it can assume that any files in the second staging are either new or incomplete and take appropriate action based on its own internal state. When the processing is done it can choose to either delete the file, or move it to a "finished" area for archiving or what not.