java file apache-camel streaming

Streaming file with camel and readLock=none


I am trying to consume (stream) a big zip file with Apache Camel. The streaming should begin as soon as the file starts being written. Below is the file consumer code.

        rest("/api/request/{Id}/")
            .get()
            .produces(MediaType.APPLICATION_OCTET_STREAM_VALUE)
            .process(new FindFileName())
            .pollEnrich().simple("file:" + outputDir + "?fileName=${property.filnavn}&noop=false&readLock=none&delete=true").timeout(pollTimeout);

Claus Ibsen suggested using readLock=none to get the stream. When I use that option, the stream closes right away and I only get a 0-byte file with the correct filename.

How do I configure Camel's file endpoint to use readLock=none and keep consuming the file until it is complete?

A separate route writes the file.


Solution

  • There is no safe way to know when a 3rd party has finished writing a file. What you do here is get hold of a java.io.File in the poll enrich, which Camel can convert to a FileInputStream to read from. But that stream has no way of knowing when the 3rd party has finished writing the file.

    Therefore it is really bad practice to read files that are still being written.
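
    A minimal plain-JDK sketch of why the consumer in the question ends up with an empty file. It does not use Camel; the class and method names (InProgressReadDemo, drain, readBeforeWriterFinishes) are hypothetical. The reader hits end-of-file at whatever length the file has at that moment, and nothing in the stream tells it that more bytes will follow.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class InProgressReadDemo {

    /** Reads until the stream reports end-of-file and returns the number of bytes seen. */
    static long drain(InputStream in) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    /** Returns { bytes seen by the early reader, final size of the file }. */
    static long[] readBeforeWriterFinishes() throws IOException {
        // The file exists but the (simulated) writer has not written anything yet.
        Path file = Files.createTempFile("in-progress", ".zip");
        try {
            long seenByReader;
            try (InputStream in = Files.newInputStream(file)) {
                // read() returns -1 immediately: end-of-file only means
                // "no more bytes right now", not "the writer is finished".
                seenByReader = drain(in);
            }
            // The writer only adds its content after the reader has given up.
            Files.write(file, "late payload".getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.APPEND);
            return new long[] { seenByReader, Files.size(file) };
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] result = readBeforeWriterFinishes();
        // prints "reader saw 0 bytes, file ended up with 12 bytes"
        System.out.println("reader saw " + result[0]
                + " bytes, file ended up with " + result[1] + " bytes");
    }
}
```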

    To signal that a file is completely written, 3rd parties may use a strategy such as:

    • write a 2nd dummy marker file to signal that the file is finished
    • write a 2nd in-progress dummy file to signal that the file is currently being written, and delete it when the file is finished
    • write the file using a temporary name and rename it when done
    • write the file in another folder and move it when done
    • monitor the file's modified timestamp, and if it doesn't change for X period, assume the file is finished
    • attempt to rename the file, and if the OS refuses, assume the 3rd party is still writing to it
    • etc.
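
    Two of the strategies above, sketched with plain java.nio under stated assumptions: the ".inprogress" suffix, the class name CompletedFileStrategies, and both method names are hypothetical, and the quiet period is whatever X you choose. The first method writes under a temporary name and renames when done; the second applies the modified-timestamp heuristic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.Duration;
import java.time.Instant;

final class CompletedFileStrategies {

    /** Hypothetical suffix marking a file that is still being written. */
    static final String IN_PROGRESS_SUFFIX = ".inprogress";

    /**
     * Writer side: write the content under a temporary name in the same
     * directory, then rename it to the final name once it is complete.
     * A consumer that only looks for the final name never sees a partial file.
     */
    static Path writeThenRename(Path target, byte[] content) throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + IN_PROGRESS_SUFFIX);
        Files.write(temp, content);
        // ATOMIC_MOVE fails with an exception if the file system cannot
        // perform the rename as a single atomic operation.
        return Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    /**
     * Consumer side: heuristic only. Treats the file as finished when its
     * last-modified time is at least quietPeriod older than now. A writer
     * that pauses for longer than quietPeriod will fool this check.
     */
    static boolean looksFinished(Path file, Duration quietPeriod, Instant now)
            throws IOException {
        Instant lastModified = Files.getLastModifiedTime(file).toInstant();
        return !lastModified.plus(quietPeriod).isAfter(now);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rename-demo");
        Path target = writeThenRename(dir.resolve("big.zip"), new byte[] { 1, 2, 3 });
        System.out.println("published " + target + ", looks finished: "
                + looksFinished(target, Duration.ofSeconds(10), Instant.now()));
    }
}
```

    Since a separate Camel route writes the file here, it may be worth checking the file component documentation for built-in equivalents, for example the tempFileName option on the producer and the doneFileName option on the consumer.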

    The JDK File Lock API does not work across file systems and is generally not very usable for getting file locks. It may work within the same JVM, but not when 2 different systems are involved.
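
    For reference, a small sketch of the JDK API in question (java.nio.channels.FileLock); the class and method names are hypothetical. It only shows the same-JVM case, where a second lock attempt on the same file is rejected with OverlappingFileLockException. It demonstrates nothing about other processes or machines, where the behaviour depends on the OS and file system.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SameJvmLockDemo {

    /** Returns true if a second lock attempt from this JVM is rejected while the first lock is held. */
    static boolean secondAttemptRejected() throws IOException {
        Path file = Files.createTempFile("lock-demo", ".bin");
        try (FileChannel writer = FileChannel.open(file, StandardOpenOption.WRITE);
             FileChannel other = FileChannel.open(file, StandardOpenOption.WRITE)) {
            FileLock held = writer.tryLock();
            if (held == null) {
                return false; // some other process already holds a lock
            }
            try {
                // Same JVM, same file: the JDK tracks its own locks and
                // refuses an overlapping one by throwing an exception.
                other.tryLock();
                return false;
            } catch (OverlappingFileLockException e) {
                return true;
            } finally {
                held.release();
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second attempt rejected: " + secondAttemptRejected());
    }
}
```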