Tags: spring, spring-boot, spring-batch, spring-integration

Stream file in chunks with sftp streaming inbound adapter


I am using the SFTP streaming inbound channel adapter, and it appears to read the entire file in one shot. It gives me an InputStream that is later transformed by a stream transformer. I want to read the file directly from the server in chunks, without downloading it first, so that a very large file does not fill the heap. Can I do that with Spring Integration?

The general idea is to read files from a remote file server without downloading them to a local or temp folder, and hand them off to Spring Batch. Is there a way to do that with Spring Integration?

I want to use Spring Batch because if, say, the app dies while processing a file, Spring Batch could potentially resume from the point where it failed, provided I remove the file's entry from the metadata store so that the file is picked up again.


Solution

  • Well, that is exactly what that SFTP streaming inbound channel adapter does:

    public class SftpStreamingMessageSource extends AbstractRemoteFileStreamingMessageSource<SftpClient.DirEntry> {
    

    where that one is:

    public abstract class AbstractRemoteFileStreamingMessageSource<F>
        extends AbstractFetchLimitingMessageSource<InputStream> implements ManageableLifecycle {
    

    So the message produced by this MessageSource has an InputStream as its payload. The adapter itself does not load the file content into memory; that only happens when you read the whole stream yourself. And it sounds like that is exactly what you do with the StreamTransformer:

    /**
     * Transforms an InputStream payload to a byte[] or String (if a
     * charset is provided).
     *
     * @author Gary Russell
     * @since 4.3
     *
     */
    public class StreamTransformer extends AbstractTransformer {
    

    So, if this simple approach of loading a byte array from the file InputStream does not fit your requirements, you need to look into some other implementation that consumes the InputStream in whatever manner you need.

    You may look into a FileSplitter, which is able to iterate over the lines of the file and emit an individual message for each of them: https://docs.spring.io/spring-integration/reference/file/splitter.html#page-title
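
If you would rather consume the InputStream payload yourself, the idea can be sketched in plain Java with no Spring dependency. This is only a sketch under stated assumptions: the class `ChunkedLineReader`, the method `forEachChunk`, and the `chunkSize` parameter are hypothetical names invented for illustration, and the file is assumed to be line-oriented UTF-8 text. Only one chunk of lines is held in memory at a time, so heap usage is bounded by the chunk size rather than the file size.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Spring Integration.
final class ChunkedLineReader {

    private ChunkedLineReader() {
    }

    /**
     * Reads the stream line by line and passes the lines to the handler
     * in lists of at most chunkSize elements. Returns the number of lines read.
     * Assumes UTF-8 text; the stream is closed when reading finishes.
     */
    static long forEachChunk(InputStream in, int chunkSize, Consumer<List<String>> handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long total = 0;
        List<String> chunk = new ArrayList<>(chunkSize);
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                total++;
                if (chunk.size() == chunkSize) {
                    handler.accept(chunk);
                    // Start a fresh list so the handler may keep the old one.
                    chunk = new ArrayList<>(chunkSize);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Emit the final, possibly shorter, chunk.
        if (!chunk.isEmpty()) {
            handler.accept(chunk);
        }
        return total;
    }
}
```

A call such as `ChunkedLineReader.forEachChunk(payload, 1000, lines -> process(lines))`, where `process` is a placeholder for your own logic, could sit in a service activator that receives the message from the streaming adapter. As far as I recall from the reference documentation, when you consume the stream yourself you are also expected to close the remote session exposed in the `closeableResource` message header, which StreamTransformer and FileSplitter otherwise do for you; please verify this against the version you use.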