I am trying to mirror a public FTP server to a local directory. When I use wget -m {url}, wget quickly skips lots of files that have already been downloaded (and for which no newer version exists). When I use lftp open -u user,pass {url}; mirror, lftp sends an MDTM request for every file before deciding whether to download it. With 2 million+ files in 50 thousand+ directories this is very slow; besides, I get error messages saying that the MDTM of directories could not be obtained.
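For reference, the two invocations look roughly like this (example.com, the credentials, and /pub/ are placeholders for the real server and path):

    # wget: quickly skips unchanged files based on the directory listings
    wget -m ftp://example.com/pub/

    # lftp: sends one MDTM request per file before deciding
    lftp -u user,pass -e "mirror /pub/ pub/; quit" ftp://example.com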
The manual says that using set sync-mode off will result in all requests being sent at once, so that lftp doesn't wait for each response. When I do that, I get error messages from the server saying there are too many connections from my IP address.
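That attempt looked roughly like this (ftp:sync-mode is the full name of the setting; placeholders as above):

    # this is the variant that triggers the "too many connections" errors
    lftp -u user,pass -e "set ftp:sync-mode off; mirror /pub/ pub/; quit" ftp://example.com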
I tried running wget first to download only the newer files, but this does not delete the files that were removed from the FTP server, so I follow up with lftp to remove the old files. However, lftp still sends MDTM for each file, which means there is no advantage to this approach.
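Concretely, the two-step workflow looks like this (placeholders as above; -nH drops the hostname directory so both tools operate on the same local tree):

    # Step 1: wget quickly fetches new and updated files
    wget -m -nH ftp://example.com/pub/

    # Step 2: lftp prunes files that were removed on the server, but
    # --only-newer still costs one MDTM request per file
    lftp -u user,pass -e "mirror --delete --only-newer /pub/ pub/; quit" ftp://example.com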
If I use set ftp:use-mdtm off, then it seems that lftp just downloads all the files again.
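For completeness, this is the combination that re-downloads everything (placeholders as above):

    lftp -u user,pass -e "set ftp:use-mdtm off; mirror /pub/ pub/; quit" ftp://example.com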
Could someone suggest the correct settings for lftp with a large number of directories/files (specifically, so that it skips directories which were not updated, as wget seems to do)?
Use set ftp:use-mdtm off and mirror --ignore-time for the first invocation to avoid re-downloading all the files.
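A minimal sketch of that first run, using the same placeholder host and paths as in the question (--delete and --only-newer are taken from the workflow described in the question, not required by this answer). Since mirror sets local file times from the listing, later runs should be able to drop --ignore-time:

    lftp -u user,pass ftp://example.com <<'EOF'
    # compare timestamps from the plain LIST output instead of per-file MDTM
    set ftp:use-mdtm off
    # first run only: decide by size, not by the (imprecise) listing times
    mirror --delete --only-newer --ignore-time /pub/ pub/
    quit
    EOF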
You can also try upgrading lftp and/or using set ftp:use-mlsd on; in this case lftp will get precise file modification times from the MLSD command output (provided that the server supports the command).
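To check whether the server supports it before relying on it, you can send a raw FEAT command (quote passes a command through to the server uninterpreted; placeholders as above):

    lftp -u user,pass ftp://example.com <<'EOF'
    # FEAT asks the server to list its supported extensions;
    # look for MLSD/MLST in the reply
    quote FEAT
    set ftp:use-mlsd on
    mirror --delete --only-newer /pub/ pub/
    EOF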