
Wget retrying server errors after 30 mins, not 10 seconds


I run a simple bash script nightly on a Debian Stretch server to back up several industrial devices at different sites over VPNs. As these devices age, they seem to develop a problem where they accept an FTP connection but refuse to provide a directory listing. This causes wget to report a non-fatal error.

I would expect wget to handle this error according to the --waitretry option, which the man page describes as follows:

--waitretry=seconds
If you don't want Wget to wait between every retrieval, but only between retries of failed downloads, you can use this option. Wget will use linear backoff, waiting 1 second after the first failure on a given file, then waiting 2 seconds after the second failure on that file, up to the maximum number of seconds you specify. By default, Wget will assume a value of 10 seconds.
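
As a rough check of what that description implies, here is a minimal sketch that adds up the waits wget would insert between retries if it followed the linear backoff exactly as the man page states it. The helper name `backoff_total` is hypothetical, and the sketch only models the documented rule, not wget's actual internals.

```shell
# backoff_total WAITRETRY TRIES
# Sums the waits between attempts under the documented linear backoff:
# 1 s after the first failure, 2 s after the second, capped at WAITRETRY.
# Hypothetical helper; it models the man page text, not wget's source.
backoff_total() (
    waitretry=$1
    tries=$2
    total=0
    attempt=1
    # One wait follows each failed attempt except the last one.
    while [ "$attempt" -lt "$tries" ]; do
        if [ "$attempt" -lt "$waitretry" ]; then
            delay=$attempt
        else
            delay=$waitretry
        fi
        total=$(( total + delay ))
        attempt=$(( attempt + 1 ))
    done
    echo "$total"
)

backoff_total 10 20   # defaults (10 s cap, 20 tries): prints 145
backoff_total 30 5    # --waitretry=30 -t 5: prints 10
```

Under that reading, the documented backoff accounts for only a few minutes in total, even with all 20 default tries.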

However, instead of waiting the default 10 seconds, wget waits 30 minutes between retries. It also retries the default 20 times, delaying the backup for hours. I searched the man page, the online wget docs, and the local /etc/wgetrc, and found no option related to 30 minutes or 1800 seconds.

Here are two consecutive attempts showing the 30-minute delay and the server responses:

--2021-11-26 03:08:52--  ftp://user:*password*@192.168.18.4/SD1/
(try: 2) => ‘192.168.18.4/SD1/.listing’  
Connecting to 192.168.18.4:21... connected.  
Logging in as user ... Logged in!  
==> SYST ... done.    ==> PWD ... done.  
==> TYPE I ... done.  ==> CWD (1) /SD1 ... done.  
==> PORT ... done.    ==> LIST ...  
Error in server response, closing control connection.  
Retrying.

--2021-11-26 03:38:55--  ftp://user:*password*@192.168.18.4/SD1/
(try: 3) => ‘192.168.18.4/SD1/.listing’  
Connecting to 192.168.18.4:21... connected.  
Logging in as user ... Logged in!  
==> SYST ... done.    ==> PWD ... done.  
==> TYPE I ... done.  ==> CWD (1) /SD1 ... done.  
==> PORT ... done.    ==> LIST ...  
Error in server response, closing control connection.  
Retrying.
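
The gap between attempts can be read straight from the log timestamps. The following is a minimal sketch that prints the seconds between consecutive attempt headers in a wget log. It assumes GNU date (for `-d`) and the `--YYYY-MM-DD HH:MM:SS--` header format shown in the excerpt; `retry_gaps` is a hypothetical helper name.

```shell
# retry_gaps LOGFILE
# Prints the number of seconds between consecutive attempt headers.
# Assumes GNU date and headers such as:
#   --2021-11-26 03:08:52--  ftp://...
# Hypothetical helper, not part of wget.
retry_gaps() (
    prev=
    # Extract the timestamp from each header line (leading spaces allowed).
    sed -n 's/^[[:space:]]*--\([0-9-]\{10\} [0-9:]\{8\}\)--.*/\1/p' "$1" |
    while IFS= read -r stamp; do
        # -u avoids daylight-saving jumps skewing the difference.
        now=$(date -u -d "$stamp" +%s)
        if [ -n "$prev" ]; then
            echo $(( now - prev ))
        fi
        prev=$now
    done
)
```

Run as `retry_gaps ftplog`. For the two attempts in the excerpt it prints 1803, which is 30 minutes and 3 seconds.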

I added both --tries=5 (as -t 5) and --waitretry=30 to the command line to try to override the 30-minute delay:

wget --waitretry=30 -t 5 -m --no-passive -o ftplog ftp://user:*password*@192.168.18.4/SD1/

The tries option did override the default of 20, limiting wget to 5 tries, but the delay between retries remains 30 minutes. Does anyone know why this is happening?


Solution

  • I have found a workaround, although it doesn't explain why wget seems to ignore some options.

    I had another device that went offline this morning. As you can see, the errors differ from those in my original post (these are connection timeouts), and the delay between retries appears to be 2 minutes plus 12-13 seconds. The wget command for this device has no waitretry or timeout options specified:

    --2021-12-06 02:49:23--  ftp://user:*password*@192.168.40.3/SD1/
               => ‘192.168.40.3/SD1/.listing’
    Connecting to 192.168.40.3:21... failed: Connection timed out.
    Retrying.
    
    --2021-12-06 02:51:35--  ftp://user:*password*@192.168.40.3/SD1/
      (try: 2) => ‘192.168.40.3/SD1/.listing’
    Connecting to 192.168.40.3:21... failed: Connection timed out.
    Retrying.
    

    Since --waitretry had no effect, I replaced it with --read-timeout for the device that returns invalid server responses:

    wget --read-timeout=60 -t 5 -m --no-passive -o ftplog ftp://user:*password*@192.168.18.4/SD1/
    

    Now the retries are back to 2 minutes apart (not the 60 seconds specified), plus the linear backoff that the wget docs describe for --waitretry. This doesn't answer my original question, but it keeps each retry from taking 30 minutes, which is an acceptable workaround.
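
One way to bound the total time spent on a single device, whatever wget does between retries, is to run the command under coreutils `timeout`. This is a minimal sketch and not part of the original script: `run_capped` and the 900-second cap in the example are hypothetical choices.

```shell
# run_capped SECONDS COMMAND [ARGS...]
# Runs COMMAND and stops it if it is still running after SECONDS.
# Relies on coreutils `timeout`, which exits with status 124 when the
# limit is reached. run_capped is a hypothetical helper name.
run_capped() {
    max_seconds=$1
    shift
    timeout "$max_seconds" "$@"
}

# Example (commented out so the sketch makes no network connections):
# run_capped 900 wget --read-timeout=60 -t 5 -m --no-passive -o ftplog \
#     'ftp://user:*password*@192.168.18.4/SD1/'
```

An exit status of 124 tells the calling script that the cap was hit, so it can log the device as unreachable and move on to the next site.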