Search code examples
javaapachefileftpftp-client

Data truncation from FTP download using Apache


Using the Apache FTP client from commons-net:commons-net:3.6, I'm reading files from our FTPS server with

FTPClient#retrieveFile("/OUT/somefile.xml", someBAOS)

Normally, everything works well, but sometimes the file gets truncated.

This is the protocol when everything works:

< 220 ProFTPD 1.3.5a Server (someserver) [::ffff:...]
> AUTH TLS
< 234 AUTH TLS successful
> PBSZ 0
< 200 PBSZ 0 successful
> PROT P
< 200 Protection set to Private
> USER someuser
< 331 Password required for someuser
> PASS ***
< 230 User someuser logged in
> TYPE I
< 200 Type set to I
> PASV
< 227 Entering Passive Mode (...).
> RETR /OUT/somefile.xml
< 150 Opening BINARY mode data connection for /OUT/somefile.xml (4769503 bytes)
< 226 Transfer complete

When the file got truncated, the smaller size was logged:

< 150 Opening BINARY mode data connection for /OUT/somefile.xml (2569402 bytes)

The truncation happens sporadically. On the next download, one hour later, everything is fine again. We're pretty sure that the file hasn't changed during this period.

The log file was produced using SocketClient#addProtocolCommandListener and I'm pretty sure that the changed size doesn't come from my listener. I guess, the text was produced by the FTP server and dumped as is. Can someone confirm that the file size indeed comes from the server (rather than being added by the Apache client)?

Funnily, the downloaded truncated file has 2602133 bytes (and I'm rather sure, there are no \rs added by text conversion or alike; Firstly, we do conversion; secondly, the difference is 31371 bytes and there are 56577 lines there).

The most probable explanation would be someone changing the file in the meantime, but the server logs clearly say that there was nobody else at the time.

Any idea how to find out what's going on?

The result

I've got some more logs showing clearly that there were uploads around the time of the problem. At the same time, the logs claim, that there was no timing overlap. Anyway, with the confirmation that the 150 ... line comes directly from the server, there's no doubt that the concurrent access was the culprit.


Solution

  • The message

    < 150 Opening BINARY mode data connection for /OUT/somefile.xml (2569402 bytes)
    

    is coming from the server, i.e. the server that is serving the file only sees a files with a length of 2569402 bytes. The most likely reason - without knowing the actual systems involved here - is that the file you try to download is currently been created. That's why it's working a couple of minutes later because the file creation is finished then.

    This is a common problem with different solutions out there:

    • The creation of a lock-file of the same name with a defined prefix or suffix that you can check for existence and only perform the download if it's not existent.
    • Checking the size for some time and only attempt a download if the size doesn't change for a while.
    • Create the file with a temporary name or into a different directory and rename/move it to the target name and destination

    As I said, I don't think that it's a problem with your client, the server clearly reports the wrong length so the reason for the problem must be there.