I'm running the following command, which executes a group of scripts, each of which is a curl download:
parallel --resume-failed --joblog logshd.log {1} ::: SH/*.sh
The set of files downloaded is quite large. I've noticed some files don't download.
I hoped that the --resume-failed option would ensure that all failed downloads are resumed and completed.
From the GNU documentation:
Where --resume-failed reads the commands from the command line (and ignores the commands in the joblog), --retry-failed ignores the command line and reruns the commands mentioned in the joblog.
I'm not clear on what "ignores the command line" or "ignores the commands in the joblog" means. Could that be clarified?
Can --resume-failed and --retry-failed be declared within the same command and if so what is the effect of that?
Regards Conteh
If we assume the download fails intermittently, then your answer is --retries 10. It will run the command up to 10 times before giving up.
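As a rough sketch of the --retries approach, applied to the command from the question: the function name download_all is only an illustrative wrapper (it is not part of GNU Parallel), while the joblog name logshd.log and the {1} command are taken from the question.

```shell
# Hypothetical wrapper: run every download script given as an argument,
# retrying each one that fails.
download_all() {
    # --retries 10 : a job that exits non-zero is tried again, up to 10 attempts
    # --joblog     : record each job's exit status in logshd.log
    # {1}          : run the script itself, as in the original command
    parallel --retries 10 --joblog logshd.log {1} ::: "$@"
}
```

It would be called as `download_all SH/*.sh`. This only helps if a failed download makes its script exit non-zero; curl, for instance, needs --fail before HTTP errors show up in its exit status.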
--resume-failed and --retry-failed are both used when GNU Parallel has finished, and you then figure out that you want to retry some of the jobs again.
The difference between the two is in how to retry the command.
--retry-failed will run exactly the same command that failed before. It does that by looking in the joblog for the command. This is typically what you want.
--resume-failed is used if you figure out that the failing command actually needed some other option, i.e. GNU Parallel should not run exactly the same command, but should instead run a (typically slightly changed) command with the same input arguments.
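The difference might be sketched like this. Both function names are hypothetical wrappers, and `bash -x {1}` is just an assumed example of a "slightly changed" command (it runs each script with tracing turned on); logshd.log is the joblog from the question.

```shell
# Re-run the failed jobs exactly as they are recorded in the joblog.
# No command and no arguments are given: both are read from logshd.log.
retry_failed_downloads() {
    parallel --retry-failed --joblog logshd.log
}

# Re-run the failed jobs with the command given here, which may differ
# from the original one. The joblog only decides which jobs need to run
# again, so the input arguments should be the same list as before.
resume_failed_downloads() {
    parallel --resume-failed --joblog logshd.log bash -x {1} ::: "$@"
}
```

After a first run has finished, `retry_failed_downloads` repeats the failed scripts unchanged, whereas `resume_failed_downloads SH/*.sh` repeats them using the modified command.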