Using the Apache FTP client from commons-net:commons-net:3.6
, I'm reading files from our FTPS server with
FTPClient#retrieveFile("/OUT/somefile.xml", someBAOS)
Normally, everything works well, but sometimes the file gets truncated.
This is the protocol when everything works:
< 220 ProFTPD 1.3.5a Server (someserver) [::ffff:...]
> AUTH TLS
< 234 AUTH TLS successful
> PBSZ 0
< 200 PBSZ 0 successful
> PROT P
< 200 Protection set to Private
> USER someuser
< 331 Password required for someuser
> PASS ***
< 230 User someuser logged in
> TYPE I
< 200 Type set to I
> PASV
< 227 Entering Passive Mode (...).
> RETR /OUT/somefile.xml
< 150 Opening BINARY mode data connection for /OUT/somefile.xml (4769503 bytes)
< 226 Transfer complete
When the file got truncated, the smaller size was logged:
< 150 Opening BINARY mode data connection for /OUT/somefile.xml (2569402 bytes)
The truncation happens sporadically. On the next download, one hour later, everything is fine again. We're pretty sure that the file hasn't changed during this period.
The log file was produced using SocketClient#addProtocolCommandListener
and I'm pretty sure that the changed size doesn't come from my listener. I guess, the text was produced by the FTP server and dumped as is. Can someone confirm that the file size indeed comes from the server (rather than being added by the Apache client)?
Funnily, the downloaded truncated file has 2602133 bytes (and I'm rather sure, there are no \r
s added by text conversion or alike; Firstly, we do conversion; secondly, the difference is 31371 bytes and there are 56577 lines there).
The most probable explanation would be someone changing the file in the meantime, but the server logs clearly say that there was nobody else at the time.
Any idea how to find out what's going on?
I've got some more logs showing clearly that there were uploads around the time of the problem. At the same time, the logs claim, that there was no timing overlap. Anyway, with the confirmation that the 150 ...
line comes directly from the server, there's no doubt that the concurrent access was the culprit.
The message
< 150 Opening BINARY mode data connection for /OUT/somefile.xml (2569402 bytes)
is coming from the server, i.e. the server that is serving the file only sees a files with a length of 2569402 bytes. The most likely reason - without knowing the actual systems involved here - is that the file you try to download is currently been created. That's why it's working a couple of minutes later because the file creation is finished then.
This is a common problem with different solutions out there:
As I said, I don't think that it's a problem with your client, the server clearly reports the wrong length so the reason for the problem must be there.