I'm ultimately trying to use parallel as a simple job queue manager, a la here. The idea seems to be to put the commands in a file, have tail read the file (using -f option so that it keeps looking for new lines), then pipe the output of tail into parallel. So I try
true > jobqueue; tail -n+0 -f jobqueue | parallel
echo echo {} ::: a b c >> jobqueue
but nothing happens. OK... to test things, I then just try
cat jobqueue | parallel
which gives
{} ::: a b c
Meanwhile
parallel echo {} ::: a b c
correctly outputs
a
b
c
So why does parallel ignore the parallel-ish syntax when it was fed from a file, but runs fine when it's given the command directly?
FWIW this is version 20160722, and since I don't have root access on the machine I had to build from source and install into my home directory.
So why does parallel ignore the parallel-ish syntax when it was fed from a file, but runs fine when it's given the command directly?
Because that's what it is specified to do. What you're characterizing as "syntax" is defined in the manual as various command-line arguments and parts thereof. These seem mostly targeted at the case where the command to parallelize is given on parallel's command line, and the program input consists of data to operate upon. This is the mode of operation of the xargs program, which was one of the inspirations for parallel. A line read from the input is not parsed that way: it is run as is, which is why your test printed the literal text {} ::: a b c.
The fact is, you're making things more complicated than they need to be. When you run parallel without specifying a command on its command line, each line of its input is itself a command to run. Those commands don't need the kind of input-line manipulation that parallel offers, and they can't, in general, take arguments any other way than on their own command line. When you run parallel in that mode, you just feed it the exact commands you want it to run:
true > jobqueue; tail -n+0 -f jobqueue | parallel
echo echo a b c >> jobqueue
or
true > jobqueue; tail -n+0 -f jobqueue | parallel
echo echo a >> jobqueue
echo echo b >> jobqueue
echo echo c >> jobqueue
depending on what exactly you're after.
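If what you wanted from ::: was one job per argument, a minimal sketch is to do the expansion in the shell and append complete commands to the queue. The file name jobqueue is taken from the question; the loop variable arg is only illustrative.

```shell
# Queue file name as used in the question.
queue=jobqueue
true > "$queue"

# Expand the arguments in the shell and write one complete command per line,
# so that parallel has nothing left to substitute.
for arg in a b c; do
    printf 'echo %s\n' "$arg" >> "$queue"
done

# Show what a reader of the queue would receive.
cat "$queue"
```

Each line is now a plain command (echo a, echo b, echo c). If GNU parallel's --dry-run option is available in your build, parallel --dry-run echo {} ::: a b c >> jobqueue should produce similar lines without a loop, but I haven't checked that against 20160722.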
As for nothing seeming to happen when you use tail -f to feed input to parallel, I'm inclined to think that parallel is waiting for more input. Its first reads do not return enough data to trigger it to dispatch any jobs, but its standard input is still open, so it has reason to expect that more input will be coming (which is indeed appropriate here). The parallel manual's section on using it as a queue system describes much the same thing: you have to submit as many jobs as there are job slots before any of them start, and output from jobs can be held back until more jobs have been started. If you continue to feed it jobs, it will soon get enough input to start running them. When you're ready to shut down the queue, you must kill the tail command so that parallel knows it has reached the end of its input.
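To make the start, submit, and shut-down steps concrete, here is a rough sketch of the whole lifecycle. It is not the only way to do it: routing tail through a named pipe is my own choice, made so that the script knows tail's PID and can kill exactly that process. The names workdir, feed, and out are hypothetical, and the sleep is a crude assumption about how long tail needs to notice the new lines.

```shell
# Only run the sketch if GNU parallel (not the moreutils tool) is installed.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    workdir=$(mktemp -d)           # hypothetical scratch directory
    queue=$workdir/jobqueue
    fifo=$workdir/feed
    out=$workdir/out
    true > "$queue"
    mkfifo "$fifo"

    # Start the feeder and the consumer separately so $! is tail's PID.
    tail -n+0 -f "$queue" > "$fifo" &
    tail_pid=$!
    parallel < "$fifo" > "$out" &
    parallel_pid=$!

    # Submit jobs as complete commands, one per line.
    echo 'echo a' >> "$queue"
    echo 'echo b' >> "$queue"
    echo 'echo c' >> "$queue"

    # Give tail a moment to pick up the lines, then close the queue.
    sleep 2
    kill "$tail_pid"

    # With tail gone, parallel sees end of input, finishes its jobs, and exits.
    wait "$parallel_pid"
    results=$(sort "$out")         # job order is not guaranteed, so sort
    printf '%s\n' "$results"
    rm -rf "$workdir"
fi
```

With only three jobs submitted, parallel may not run anything until tail is killed, which matches the "nothing happens" symptom in the question.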