bash, parallel-processing, xargs

How to feed a list of bash commands to xargs to run in parallel?


I would like to run several instances of a program in parallel on a varying number of input files. The program itself is not parallelised, which is why I am looking for a way to launch multiple instances. I'm aware of GNU parallel; however, the bash script that I'm writing will be shared with my co-workers, and not all of them have it installed.

I found an answer that almost matches my needs here; however, the number of processes there is hardcoded, so I can't use a here document. In my case the number of input files varies, so I thought I could list the commands and feed them to xargs for execution. I tried various ways, but none of them worked. Two of my attempts to modify the code from the link:

#!/bin/bash
nprocs=3
# Attempt one: use a loop
commands=$( for ((i=0; i<5; i++)); do echo "sleep $i; echo $i;"; done )
echo Commands:
echo $commands
echo
{
    echo $commands | xargs -n 1 -P $nprocs -I {} sh -c 'eval "$1"' - {}
} &
echo "Waiting for commands to finish..."
wait $!

# Attempt two: use awk, the rest as above
commands=$( awk 'BEGIN{for (i=1; i<5; i++) { printf("sleep %d && echo \"ps %d\";\n", i, i) }}' )

The commands are executed one after the other. What could be wrong? Thanks.


Solution

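  • Try running just

    echo $commands | xargs -n 1

    to see what xargs actually receives. Because $commands is unquoted, the shell collapses its newlines into spaces, so the whole list arrives as a single line. With -I, xargs treats each input line as one argument, so everything ends up in one sh process and runs sequentially.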

    To avoid problems with quoting, I'd use an array of commands.

    #! /bin/bash
    nprocs=3
    
    commands=()
    for i in {0..4} ; do
        commands+=("sleep 1; echo $i")
    done
    
    echo Commands:
    echo "${commands[@]}"
    
    # Print one command per line; with -I, xargs passes each whole line to bash -c.
    printf '%s\n' "${commands[@]}" \
    | xargs -P "$nprocs" -I % bash -c % &
    
    echo "Waiting for commands to finish..."
    wait $!
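
For the use case in the question, a varying number of input files, the same idea can skip the command strings and pass the file names themselves to xargs. What follows is only a minimal sketch, assuming a GNU or BSD xargs (-0 and -P are extensions, not POSIX). The temporary directory, the demo files, and wc -c as a stand-in for the real program are all made up for illustration:

```shell
#!/bin/bash
# Sketch: run one worker per input file, at most $nprocs at a time.
nprocs=3

# Hypothetical demo inputs; a real script would use its own file list.
workdir=$(mktemp -d)
for i in 1 2 3 4 5; do
    echo "data $i" > "$workdir/input $i.txt"
done

# The array holds however many files the glob matches.
files=("$workdir"/*.txt)

# NUL-separate the names so spaces or quotes in them are passed through intact.
# "wc -c" stands in for the real program; each run writes <input>.out.
printf '%s\0' "${files[@]}" \
| xargs -0 -n 1 -P "$nprocs" sh -c 'wc -c < "$1" > "$1.out"' sh

echo "Processed ${#files[@]} files"
```

Since xargs runs in the foreground here, it returns only once every worker has finished, so no & and wait are needed.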