Spring Integration SFTP - issue with filters and number of messages emitted


I have started using Spring Integration SFTP and I have some questions.

  1. Filters are not working. Here is my example configuration:
Sftp.inboundAdapter(ftpFileSessionFactory())
                .preserveTimestamp(true)
                .deleteRemoteFiles(false)
                .remoteDirectory(integrationProperties.getRemoteDirectory())
                .filter(sftpFileListFilter()) // doesn't work
                .patternFilter("*.xlsx") // doesn't work

And my ChainFileListFilter:

private ChainFileListFilter<ChannelSftp.LsEntry> sftpFileListFilter() {
    ChainFileListFilter<ChannelSftp.LsEntry> chainFileListFilter = new ChainFileListFilter<>();
    chainFileListFilter.addFilter(new SftpPersistentAcceptOnceFileListFilter(metadataStore(), "INT"));
    chainFileListFilter.addFilter(new SftpSimplePatternFileListFilter("*.xlsx"));
    return chainFileListFilter;
}

If I understand correctly, only XLSX files should be saved in the local directory. If so, that is not what happens with this configuration. Am I doing something wrong, or have I misunderstood how this works?

  2. How can I configure SFTP so that each downloaded file emits a message? I see two parameters in the docs, max-messages-per-poll and max-fetch-size, but I don't know how to set them so that every file emits a message. I would like to sync files once every 24 hours and produce a batch job queue. Maybe there is a workaround?

  3. Is there a built-in filter that lets me fetch only files whose content has changed? The best solution would be to compare the checksums of the files.

I will be grateful for your help and explanations.


Solution

  • You cannot combine filter() and patternFilter(). Only one of them can be used: the last one overrides whatever you used before. In other words: either filter() or patternFilter(), not both. By default the logic is like this:

    public SftpInboundChannelAdapterSpec patternFilter(String pattern) {
        return filter(composeFilters(new SftpSimplePatternFileListFilter(pattern)));
    }
    
    private CompositeFileListFilter<ChannelSftp.LsEntry> composeFilters(FileListFilter<ChannelSftp.LsEntry>
            fileListFilter) {
        CompositeFileListFilter<ChannelSftp.LsEntry> compositeFileListFilter = new CompositeFileListFilter<>();
        compositeFileListFilter.addFilters(fileListFilter,
                new SftpPersistentAcceptOnceFileListFilter(new SimpleMetadataStore(), "sftpMessageSource"));
        return compositeFileListFilter;
    }
    

    So, technically you don't need your custom one if you don't use an external persistent MetadataStore. But if you do, consider swapping the order of SftpSimplePatternFileListFilter and SftpPersistentAcceptOnceFileListFilter, since it is better to check the pattern before storing the file in the MetadataStore.
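
    As a minimal sketch of that reordering, assuming the JSch-based ChannelSftp.LsEntry API used in the question and a ConcurrentMetadataStore supplied by the caller (the SftpFilters class name and the method parameter are illustrative, not taken from the original code):

    ```java
    import com.jcraft.jsch.ChannelSftp;
    import org.springframework.integration.file.filters.ChainFileListFilter;
    import org.springframework.integration.metadata.ConcurrentMetadataStore;
    import org.springframework.integration.sftp.filters.SftpPersistentAcceptOnceFileListFilter;
    import org.springframework.integration.sftp.filters.SftpSimplePatternFileListFilter;

    public class SftpFilters {

        // Illustrative helper: the pattern filter runs first, the accept-once filter second.
        public static ChainFileListFilter<ChannelSftp.LsEntry> sftpFileListFilter(
                ConcurrentMetadataStore metadataStore) {
            ChainFileListFilter<ChannelSftp.LsEntry> chain = new ChainFileListFilter<>();
            // 1. Drop everything that is not an .xlsx file.
            chain.addFilter(new SftpSimplePatternFileListFilter("*.xlsx"));
            // 2. Only files that passed the pattern are recorded in the metadata store.
            chain.addFilter(new SftpPersistentAcceptOnceFileListFilter(metadataStore, "INT"));
            return chain;
        }
    }
    ```

    Because a ChainFileListFilter hands each filter only the files accepted by the previous one, files that do not match *.xlsx should never reach the accept-once filter in this order. Pass the result to filter() only, without an additional patternFilter() call.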

    Every synchronized remote file that passes those filters is stored in the local directory, and a message for each local file is emitted as soon as the poller requests one.

    The maxFetchSize applies when remote files are loaded into the local directory. The maxMessagesPerPoll is used by the poller, but those messages are built from files that are already in the local directory. A message is emitted per local file, not as a batch for all of them; batching is not what messaging is designed for.
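
    For the once-a-day case from the question, here is a hedged sketch of how these settings might be wired with the Java DSL. It assumes Spring Integration 5.x (the IntegrationFlows factory and ChannelSftp.LsEntry); the class name, bean name, directories and cron expression are placeholders. As far as I know, a polling inbound adapter emits one message per poll unless maxMessagesPerPoll is changed, so a negative value is used here to keep emitting until no local file is left in that polling cycle.

    ```java
    import java.io.File;

    import com.jcraft.jsch.ChannelSftp;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.integration.dsl.IntegrationFlow;
    import org.springframework.integration.dsl.IntegrationFlows;
    import org.springframework.integration.dsl.Pollers;
    import org.springframework.integration.file.remote.session.SessionFactory;
    import org.springframework.integration.sftp.dsl.Sftp;

    @Configuration
    public class SftpFlowConfig {

        @Bean
        public IntegrationFlow sftpInboundFlow(SessionFactory<ChannelSftp.LsEntry> sftpSessionFactory) {
            return IntegrationFlows
                    .from(Sftp.inboundAdapter(sftpSessionFactory)
                                    .preserveTimestamp(true)
                                    .deleteRemoteFiles(false)
                                    .remoteDirectory("/remote/dir")            // placeholder path
                                    .localDirectory(new File("/tmp/sftp-in"))  // placeholder path
                                    // maxFetchSize(n) could cap the files fetched per sync; left unset here
                                    .patternFilter("*.xlsx"),
                            e -> e.poller(Pollers.cron("0 0 2 * * *")          // placeholder: daily at 02:00
                                    .maxMessagesPerPoll(-1)))                  // negative = no per-poll limit
                    // One message per local file arrives here.
                    .handle(message -> System.out.println("Received file: " + message.getPayload()))
                    .get();
        }
    }
    ```

    Each message carries a single java.io.File payload, so a downstream handler can enqueue one batch job per file.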

    Please share more information about what does not work with the files. The SftpPersistentAcceptOnceFileListFilter checks not only the file name but also the mtime of the file. So it is not based on a checksum but on the last-modified timestamp of the file.
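
To illustrate the name-plus-mtime behaviour, here is a simplified plain-Java model. It is not the Spring class itself: AcceptOnceByMtime and its in-memory map are hypothetical stand-ins for the filter and its MetadataStore.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of an accept-once filter keyed by file name and mtime. */
class AcceptOnceByMtime {

    // Stand-in for the metadata store: file name -> last seen modification time.
    private final Map<String, Long> seen = new HashMap<>();

    /** Returns true if the file is new or its modification time differs from the stored one. */
    boolean accept(String fileName, long modifiedMillis) {
        Long previous = seen.put(fileName, modifiedMillis);
        return previous == null || previous != modifiedMillis;
    }
}
```

In this model, calling accept("report.xlsx", 1000L) twice returns true and then false, and a later call with a different timestamp returns true again. A file whose content changes while its timestamp stays the same would not be picked up again, which is where a custom checksum-based filter would behave differently.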