spring-integration, spring-integration-dsl

Fetching file from FTP using Spring Integration


I have a project where I need to fetch .csv files from a remote FTP folder. The issue is that the files are big (around 6-7 MB), and the poller starts reading them before the third party has finished transferring them, which causes an exception.

I saw that we can use a LastModifiedFileListFilter, but I am not sure if this is the proper solution.

Here is my code sample.

    @Bean
    public IntegrationFlow ftpInboundFlow() {
        return IntegrationFlows
                .from(Ftp.inboundAdapter(ftpSessionFactory())
                                .preserveTimestamp(true)
                                .remoteDirectory("/ftp/GE/Inbound")
                                .patternFilter("*.csv")
                                .deleteRemoteFiles(true)
                                .localDirectory(new File("inbound"))
                                .temporaryFileSuffix(TEMPORARY_FILE_SUFFIX),
                        e -> e.id("ftpInboundAdapter")
                                .poller(Pollers.fixedDelay(5000))
                                .autoStartup(true))
                .transform(e -> {
                    log.info("Sending CSV file " + e + " to FTP server");
                    return e;
                })
                .handle(Ftp.outboundAdapter(ftpSessionFactory())
                        .useTemporaryFileName(true)
                        .autoCreateDirectory(true)
                        .remoteDirectory("/ftp/GE/Inbound/history"))
                .get();
    }

Exception:

Caused by: org.springframework.messaging.MessagingException: Failure occurred while copying '/ftp/GE/Inbound/OA_ex_PK_2020_2021.csv' from the remote to the local directory; nested exception is java.io.IOException: Failed to copy '/ftp/GE/Inbound/OA_ex_PK_2020_2021.csv'. Server replied with: 550 The process cannot access the file because it is being used by another process. 

Solution

  • The LastModifiedFileListFilter is for the local file system. However, I think the idea is good, and it is enough to implement a similar "last modified" filter for FTP. See FtpPersistentAcceptOnceFileListFilter and how it takes the modified timestamp from the remote entry.

    Such an FtpLastModifiedFileListFilter has to be the first filter in the ChainFileListFilter that you provide to your Ftp.inboundAdapter. The second one would indeed be your new FtpSimplePatternFileListFilter("*.csv"). The last one in the chain has to be an AcceptOnceFileListFilter or its FTP variant, FtpPersistentAcceptOnceFileListFilter.
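
    The "last modified" check described above could be sketched roughly as follows. This is a minimal, framework-free illustration, not an existing Spring Integration class: AgeBasedListFilter, minAge, and the timestamp accessor are hypothetical names. In a real project the class would presumably implement FileListFilter<FTPFile> and read the timestamp from FTPFile.getTimestamp(); the generic accessor here only keeps the sketch runnable without an FTP server.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/**
 * Hypothetical sketch: accepts only entries that have not been modified
 * for at least minAge, on the assumption that such files are fully written.
 */
final class AgeBasedListFilter<F> {

    private final Duration minAge;
    // Extracts the remote last-modified time from an entry (assumed accessor).
    private final Function<F, Instant> lastModified;

    AgeBasedListFilter(Duration minAge, Function<F, Instant> lastModified) {
        this.minAge = minAge;
        this.lastModified = lastModified;
    }

    /** Returns the entries whose last modification is at least minAge before now. */
    List<F> filterFiles(List<F> files, Instant now) {
        List<F> accepted = new ArrayList<>();
        for (F file : files) {
            Instant modified = lastModified.apply(file);
            // Entries without a timestamp are skipped rather than guessed at.
            if (modified != null && !modified.plus(minAge).isAfter(now)) {
                accepted.add(file);
            }
        }
        return accepted;
    }
}
```

    With a minAge of, say, 60 seconds, a file that is still being uploaded keeps getting a fresh timestamp and is skipped until it has been quiet for a full minute. The right value depends on how the third party uploads and on the timestamp resolution of the FTP server, so treat it as something to tune.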

    The FtpLastModifiedFileListFilter solution could be contributed back to the framework, since you are not the first to ask for something similar.

    Another way is to change the writing logic so that it goes through a tmp file. That way the final files are not visible to you until the tmp files are fully written and renamed.
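
    The write-then-rename idea can be illustrated on a local file system as below. This is only a sketch of the pattern on the producer side: TmpThenRenameWriter and the ".writing" suffix are made up for the example, and over FTP the third party would do the equivalent with an upload under a temporary name followed by a rename.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch of the "write to tmp, then rename" pattern. */
final class TmpThenRenameWriter {

    // Assumed suffix; anything the consumer's "*.csv" pattern does not match works.
    static final String TMP_SUFFIX = ".writing";

    static Path write(Path target, String content) throws IOException {
        // Write the full content under a temporary name in the same directory.
        Path tmp = target.resolveSibling(target.getFileName() + TMP_SUFFIX);
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        // Only a complete file ever appears under the final name.
        return Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

    Because the temporary name does not end in .csv, a consumer filtering on "*.csv" never sees the partially written file.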

    One more solution is simply to ignore that exception and let things eventually settle by themselves. The point is that FtpPersistentAcceptOnceFileListFilter, used by default internally, is both a ReversibleFileListFilter<F> and a ResettableFileListFilter<F>, so when an exception happens in FtpInboundFileSynchronizer.copyFileToLocalDirectory(), the synchronizer has this logic:

        catch (RuntimeException | IOException e1) {
            if (filteringOneByOne) {
                resetFilterIfNecessary(file);
            }
            else {
                rollbackFromFileToListEnd(filteredFiles, file);
            }
            throw e1;
        }
    

    So the failed file is removed from the filter store, and therefore it will be picked up again on the next polling cycle.