bash, parallel-processing, gnu, gnu-parallel

How to use GNU Parallel


How do I use GNU parallel with the aws sync command?

I have a file with the following commands:

aws s3 cp ./test s3://test --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.html" --profile $PROFILE

aws s3 cp ./test s3://test  $S3BUCKET --recursive --content-encoding "gzip" --content-type "text/html" --cache-control "max-age=$MAXAGE" --exclude "*" --include "*.css" --profile $PROFILE

How can I use GNU parallel to run these commands in parallel?

I put the commands in a file called test.sh and ran the following command:

parallel < test.sh

How do I pass arguments to the test.sh file? For example, I want to pass in the aws bucket name.


Solution

  • If your goal is to make the script fail when any one of a set of hand-written commands fails, GNU parallel isn't the best tool for the job: the shell itself already provides everything needed in the wait command, which is specified by POSIX and available out of the box in every standards-compliant shell (see also the specification requiring it to be implemented as a builtin).

    #!/bin/bash
    #      ^^^^- Important! /bin/sh doesn't have arrays; bash, ksh, or zsh will work.
    
    # For readability, put common arguments in an array
    common_args=(
      --recursive
      --content-encoding "gzip"
      --content-type "text/html"
      --cache-control "max-age=$MAXAGE"
      --exclude "*"
      --profile "$PROFILE"
    )
    
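    # Record PIDs of the various jobs in an array.
    # --include comes after the common arguments: aws applies filters in order,
    # so it must follow the --exclude "*" in common_args to take effect.
    pids=( )
    aws s3 cp ./test s3://test             "${common_args[@]}" --include='*.html' & pids+=( $! )
    aws s3 cp ./test s3://test "$S3BUCKET" "${common_args[@]}" --include='*.css'  & pids+=( $! )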
    
    # If either background job failed, exit the script with the same exit status
    for pid in "${pids[@]}"; do
      wait "$pid" || exit
    done
    

    Note that arrays are used above for convenience, not necessity. To write code that works on any baseline POSIX shell, you could supply the common arguments from a function, and collect the list of PIDs in a scalar variable or by overriding "$@" inside a shell function.
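
The array-free variant described in the note might look roughly like the sketch below. It is not a tested deployment script: aws_cp, wait_all and upload_all are hypothetical helper names, and taking the bucket name as an argument is only one assumed way to answer the "how do I pass in the bucket name" part of the question. MAXAGE and PROFILE are expected in the environment, as in the original commands.

```shell
#!/bin/sh
# Sketch only: aws_cp, wait_all and upload_all are hypothetical helper names.
# MAXAGE and PROFILE are assumed to be set in the environment.

# Common arguments live in a function instead of an array.
# $1 = destination bucket name (assumption: supplied by the caller)
# $2 = include pattern, placed after --exclude "*" so that it takes effect
aws_cp() {
  aws s3 cp ./test "s3://$1" --recursive \
    --content-encoding "gzip" --content-type "text/html" \
    --cache-control "max-age=$MAXAGE" \
    --exclude "*" --include "$2" --profile "$PROFILE"
}

# Wait for every PID given as an argument; stop at the first failure
# and return that job's exit status.
wait_all() {
  for pid in "$@"; do
    wait "$pid" || return
  done
}

# Start both uploads in the background, collecting PIDs in a scalar variable.
upload_all() {
  bucket=$1
  pids=
  aws_cp "$bucket" '*.html' & pids="$pids $!"
  aws_cp "$bucket" '*.css'  & pids="$pids $!"
  # Unquoted on purpose: PIDs contain no whitespace, so word splitting is safe.
  wait_all $pids
}
```

Ending the script with `upload_all "${1:?usage: $0 BUCKET}" || exit` would then run both uploads against the bucket named on the command line and exit with the status of the first job that failed.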