apache-camel camel-ftp

Camel File language sortBy: does the date pattern apply to the system time or to the file name?


I need to sort an FTP directory by something other than the last-modified time. I am currently using the sortBy=file:modified option, but the last-modified time does not fit my use case: sometimes a file received over FTP lags or precedes another. The contents are time-series-sensitive data, and the file names are published with a timestamp.

Example:

    fileName1_2018-12-14_12-34-33.csv    system time 03:30:23
    fileName2_2018-12-14-12-35-22.csv    system time 03:30:03

Clearly fileName1 should be consumed first, but the system modified time causes fileName2 to be consumed first. The files were created in the proper order, but the writes to the system completed out of order. Bottom line: I need to consume fileName1 before fileName2, so I can't use sortBy=file:modified.

I am thinking of simply sorting lexicographically by file name. I am also looking at sortBy=date:file:yyyyMMdd;file:name, but I cannot figure out whether the date pattern applies to the system time or whether I can use it as a pattern for the file names.

I hope this makes sense.

Long story short: is the date pattern used in sortBy matched against the file name, or does it apply to the modified time or system time? Otherwise I guess I can simply sort lexically. Thanks!

        final String fromStr = String.format("%s://%s@%s:%s/%s?password=RAW(%s)&recursive=%s&stepwise=%s&useList=%s&passiveMode=%s&disconnect=%s"
                + "&move=.processed"
                + "&maxMessagesPerPoll=100"
                + "&eagerMaxMessagesPerPoll=false"
                + "&sortBy=file:modified"
                //+ "&passiveMode=true"
                + "&sendEmptyMessageWhenIdle=false"
                //+ "&stepwise=false"
                + "&delay=10000"
                + "&initialDelay=5000"
                + "&connectTimeout=10000"
                , transport, username, host, port, path, password, recursive, stepwise, useList, passiveMode, disconnect);

Solution

  • sortBy=file:modified sorts by the file's last-modified timestamp. For FTP files this timestamp is even less precise than for local files, because it depends on the FTP server's list operation, which often returns the time only in hours and minutes.

    In your use case the file names themselves include timestamps, so you should sort by file name rather than by last-modified time.
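
    As far as the Camel File language documentation describes it, date:file:<pattern> formats the file's last-modified timestamp with the given pattern; it is not matched against the file name. Sorting by name with sortBy=file:name should be enough when all names share the same prefix and timestamp layout. In the example from the question the prefixes differ and the date/time separator is not consistent ('_' in one name, '-' in the other), so a comparator that parses the embedded timestamp may be more robust. The following is only a minimal sketch in plain Java: the class and method names are hypothetical, and the name layout is an assumption taken from the two sample file names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Camel): orders file names by the
// timestamp embedded in the name instead of the last-modified time.
class FileNameTimestampSort {

    // Assumed layout: yyyy-MM-dd, then '_' or '-', then HH-mm-ss.
    private static final Pattern STAMP =
            Pattern.compile("(\\d{4}-\\d{2}-\\d{2})[_-](\\d{2}-\\d{2}-\\d{2})");

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH-mm-ss");

    // Extracts the embedded timestamp; fails loudly if the name has none.
    static LocalDateTime extractTimestamp(String fileName) {
        Matcher m = STAMP.matcher(fileName);
        if (!m.find()) {
            throw new IllegalArgumentException("No timestamp in file name: " + fileName);
        }
        return LocalDateTime.parse(m.group(1) + " " + m.group(2), FORMAT);
    }

    // Oldest embedded timestamp first; ties fall back to lexical order.
    static int compareByEmbeddedTimestamp(String a, String b) {
        int byTime = extractTimestamp(a).compareTo(extractTimestamp(b));
        return byTime != 0 ? byTime : a.compareTo(b);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList(
                "fileName2_2018-12-14-12-35-22.csv",
                "fileName1_2018-12-14_12-34-33.csv"));
        names.sort(FileNameTimestampSort::compareByEmbeddedTimestamp);
        names.forEach(System.out::println);
    }
}
```

    To use logic like this in a route, one option would be to wrap it in a java.util.Comparator over Camel's GenericFile and reference that bean through the endpoint's sorter option instead of sortBy. That wiring is not shown here and is untested.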