
Mule SFTP inbound endpoint does not expose timestamp


I have a flow that gets files from an SFTP connector in Mule. The application does not have permission to remove the files once processed.

To prevent duplicate processing, the application uses an idempotent filter keyed on the filename and timestamp.

However, the SFTP endpoint does not include each file's timestamp in the message, so message.inboundProperties.timestamp evaluates to null.

<!-- SFTP connector -->
<sftp:connector 
    name="SFTP_Origin" 
    validateConnections="true" 
    pollingFrequency="${sftp.origin.pollingFrequency:60000}" 
    fileAge="${sftp.origin.fileAge:60000}" 
    duplicateHandling="overwrite" 
    doc:name="SFTP" 
    archiveDir="${local.archive.directory}"/>

<!-- Move XML files from one SFTP location to another location.
     Archive each file locally on the Mule server in case the file transfer fails.
     Upon failure, notify those responsible so that the problem can be remedied. -->
<flow name="TransferFilesViaSFTP_Flow" >
    <sftp:inbound-endpoint 
        connector-ref="SFTP_Origin" 
        host="${sftp.origin.host}" 
        port="${sftp.origin.port:22}" 
        path="${sftp.origin.path}" 
        user="${sftp.origin.user}" 
        password="${sftp.origin.password}" 
        responseTimeout="10000" 
        archiveDir="${local.archive.directory}" 
        doc:name="InboundSFTPEndpoint">
        <!-- Use a regex filter to accept only files whose names contain a date in the format yyyyMMdd.
             Valid dates range from 19000101 to 20991231. -->
        <file:filename-regex-filter pattern="${regex.filter:filename(.*)xml}" caseSensitive="false"/>
    </sftp:inbound-endpoint>
    <!-- Get the files via SFTP -->

    <logger 
        message="#[message]" 
        level="INFO" 
        category="sftp" 
        doc:name="Logger"/>

    <!-- Eliminate redundant file transfers by filtering out files that have 
         been transferred previously, unless their timestamp has changed. -->
    <idempotent-message-filter 
        idExpression="#[message.inboundProperties.originalFilename + '-' + 
                        message.inboundProperties.timestamp]" 
        storePrefix="prefix" 
        doc:name="Filter out redundant files transfers">
        <simple-text-file-store 
            name="filesMessages" 
            directory="${idempotent.directory}"/>
    </idempotent-message-filter>

    <!-- log event information -->
    <logger
        message="#['Payload after SFTP is ' + payload]"
        level="DEBUG"
        doc:name="Payload Logger" 
        category="sftp"/>

    <!-- Send the files to Windows Share -->
    <file:outbound-endpoint 
        path="${windows.share.path}" 
        connector-ref="WindowsShareFile" 
        responseTimeout="10000" 
        doc:name="File"/>

    <!-- log event information -->
    <logger 
        message="#['file ' + message.inboundProperties.'originalFilename' + ' successfully processed.']" 
        level="INFO" 
        category="sftp" 
        doc:name="Logger: Success"/>

    <!-- Based on property, smtp.onSuccess.sendEmail, notify when files have been moved successfully. -->
    <expression-filter 
        expression="#[${smtp.onSuccess.sendEmail:false}]" 
        doc:name="onSuccess Send Email"/>

    <set-session-variable 
        variableName="emailSubject" 
        value="#['Mule Flow Successful']" 
        doc:name="emailSubject"/>

    <flow-ref 
        name="aggregateSuccessfulFilesTransferForEmail_Subflow" 
        doc:name="aggregateSuccessfulFilesTransferForEmail_Subflow"/>

    <!-- default exception strategy -->
    <exception-strategy 
        ref="transferFilesExceptionStrategy" 
        doc:name="Reference Exception Strategy"/>
</flow>

I found what appears to be an unresolved MuleSoft issue describing this problem: https://www.mulesoft.org/jira/browse/MULE-7175

Is there another way to get the file's timestamp from the SFTP connector?


Solution

  • If the file is small enough to fit in memory:

    1. Transform the payload into a byte[] right after the sftp:inbound-endpoint.
    2. Compute a SHA hash of the payload instead of relying on the remote file's timestamp. The hash changes only when the content changes, so it is a more reliable key than a timestamp:

      idExpression="#[message.inboundProperties.originalFilename + '-' + 
                      org.apache.commons.codec.digest.DigestUtils.sha256Hex(message.payload)]"
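
Put together, the two steps above might look like the fragment below. This is a minimal sketch, not a tested configuration: it assumes Mule 3.x, that commons-codec is on the application classpath, and that object-to-byte-array-transformer is an acceptable way to read the SFTP stream into a byte[]. The filter and object store settings are copied from the question.

```xml
<!-- Sketch only: place directly after the sftp:inbound-endpoint in
     TransferFilesViaSFTP_Flow. Assumes Mule 3.x with commons-codec available. -->

<!-- Step 1: read the SFTP input stream fully into memory as a byte[].
     Assumption: the files are small enough for this. -->
<object-to-byte-array-transformer doc:name="Payload to byte[]"/>

<!-- Step 2: key the idempotent filter on the filename plus a SHA-256 digest
     of the content, so a file is reprocessed only when its content changes. -->
<idempotent-message-filter
    idExpression="#[message.inboundProperties.originalFilename + '-' +
                    org.apache.commons.codec.digest.DigestUtils.sha256Hex(message.payload)]"
    storePrefix="prefix"
    doc:name="Filter out redundant file transfers">
    <simple-text-file-store
        name="filesMessages"
        directory="${idempotent.directory}"/>
</idempotent-message-filter>
```

One trade-off of this approach: because the files stay on the remote server, each poll still downloads every file so that its hash can be computed before the filter discards duplicates.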