I am using Python and the Twisted framework to connect to an FTP site to perform various automated tasks. Our FTP server happens to be Pure-FTPd, if that's relevant.
When connecting and calling the list method on an FTPClient, the resulting FTPFileListProtocol's files collection does not contain any directories or file names that contain a space (' ').
Has anyone else seen this? Is the only solution to create a sub-class of FTPFileListProtocol and override its unknownLine method, parsing the file/directory names manually?
Firstly, if you're performing automated tasks on a retrieved FTP listing, then you should probably be looking at NLST rather than LIST, as noted in RFC 959 section 4.1.3:
NAME LIST (NLST) ... This command is intended to return information that can be used by a program to further process the files automatically.
The Twisted documentation for LIST says:
It can cope with most common file listing formats.
This makes me suspicious; I do not like solutions that "cope". LIST was intended for human consumption, not machine processing.
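To illustrate why parsing LIST output is fragile, here is a minimal sketch of a parser for one common Unix `ls -l` style line. It is not Twisted's implementation; the function name `parse_unix_list_line` and the sample line are made up for illustration, and the nine-column layout is an assumption that many servers do not follow.

```python
def parse_unix_list_line(line):
    """Split one Unix 'ls -l' style LIST line into (is_dir, size, name).

    Returns None when the line does not have the nine assumed columns:
    perms, links, owner, group, size, month, day, time-or-year, name.
    """
    # Limit the split to 8 cuts so that everything after the eighth
    # column (the name, embedded spaces included) stays in one piece.
    parts = line.rstrip("\r\n").split(None, 8)
    if len(parts) < 9 or not parts[4].isdigit():
        return None
    perms, size, name = parts[0], int(parts[4]), parts[8]
    return perms.startswith("d"), size, name


# Hypothetical sample line, not captured from a real server.
sample = "-rw-r--r--   1 ftp      ftp          1024 Jan 01 12:00 my report.txt"
print(parse_unix_list_line(sample))  # (False, 1024, 'my report.txt')
```

Even this sketch has gaps: it ignores symlink entries of the form `name -> target`, loses leading spaces in a name, and rejects any server whose column layout differs. A parser that matches the name as a run of non-space characters would drop the sample entry entirely.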
If your target server supports them, then you should prefer MLST and MLSD, as defined in RFC 3659 section 7:
7. Listings for Machine Processing (MLST and MLSD) The MLST and MLSD commands are intended to standardize the file and directory information returned by the server-FTP process. These commands differ from the LIST command in that the format of the replies is strictly defined although extensible.
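As a rough sketch of what "strictly defined" buys you: in RFC 3659 each entry is a run of `name=value;` facts, then a single space, then the path name. Fact values cannot contain a space, so the first space marks where the name starts. The function name `parse_mlsd_line` and the sample entry below are hypothetical.

```python
def parse_mlsd_line(line):
    """Parse one MLSD entry into (facts, name).

    Assumes the RFC 3659 layout: 'fact=value;fact=value; path name'.
    """
    line = line.rstrip("\r\n")
    # The facts end at the first space; everything after it is the
    # name, so embedded spaces in the name are preserved.
    facts_part, _, name = line.partition(" ")
    facts = {}
    for fact in facts_part.split(";"):
        if not fact:
            continue  # skip the empty piece after the trailing ';'
        key, _, value = fact.partition("=")
        facts[key.lower()] = value  # fact names are case-insensitive
    return facts, name


# Hypothetical sample entry, not captured from a real server.
facts, name = parse_mlsd_line(
    "Type=file;Size=1024;Modify=20090101120000; my report.txt"
)
print(name)           # my report.txt
print(facts["type"])  # file
```

If you are free to use Python's standard library instead of Twisted, `ftplib.FTP.mlsd()` performs this kind of parsing against servers that support the command.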
However, these newer commands may not be available on your target server, and I don't see them in Twisted. Therefore NLST is probably your best bet.
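The reason NLST is easier to process is that the reply is just one name per line, so there are no columns to misparse. A minimal sketch, assuming the raw bytes from the data connection have already been collected and that names are UTF-8 encoded (both assumptions; `parse_nlst` is a hypothetical helper, not a Twisted API):

```python
def parse_nlst(payload):
    """Return the names from a raw NLST reply, one per line."""
    names = []
    for raw in payload.split(b"\n"):
        raw = raw.rstrip(b"\r")  # lines normally end in CRLF
        if raw:
            # UTF-8 is an assumption; servers may use another encoding.
            names.append(raw.decode("utf-8", errors="replace"))
    return names


# Hypothetical payload, not captured from a real server.
print(parse_nlst(b"readme.txt\r\nmy report.txt\r\nNew Folder\r\n"))
# ['readme.txt', 'my report.txt', 'New Folder']
```

The trade-off is that NLST gives you names only; it does not tell you which entries are directories.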
As to the nub of your problem, there are three likely causes:
NLST/LIST, but some servers react differently if arguments are supplied to these commands)
You can eliminate (2) and (3) and prove that the cause is (1) by looking at what is sent over the wire. If this option is not available to you as part of the Twisted API or the Pure-FTPd server logging configuration, then you may need to break out a network sniffer such as tcpdump, snoop or Wireshark (assuming you're allowed to do this in your environment). Note that you will need to trace not only the control connection (port 21) but also the data connection (since that carries the results of the LIST/NLST command). Wireshark is nice since it will perform the protocol-level analysis for you.
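If a sniffer is not an option, one alternative is to take Twisted out of the picture and fetch the raw lines with Python's standard `ftplib`. This is a swapped-in technique, not part of Twisted. The helper names below are hypothetical, and the host and credentials are placeholders you would supply.

```python
from ftplib import FTP


def capture_raw_listing(ftp, command="LIST"):
    """Return the unparsed lines the server sends for LIST or NLST."""
    lines = []
    # retrlines calls the callback once per line, line ending stripped.
    ftp.retrlines(command, lines.append)
    return lines


def compare_listings(host, user, password):
    """Fetch both listings from the same server for side-by-side reading."""
    ftp = FTP()
    ftp.set_debuglevel(2)  # print the control-connection conversation
    ftp.connect(host)
    ftp.login(user, password)
    try:
        return {cmd: capture_raw_listing(ftp, cmd) for cmd in ("LIST", "NLST")}
    finally:
        ftp.quit()
```

Calling `compare_listings("ftp.example.com", "user", "secret")` (placeholder values) would show whether the names containing spaces appear in the raw output at all. If they do, the server is sending them and the place to look is the client-side parsing.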
Good luck.