bash gnu-parallel

Tracking status/progress in GNU Parallel


I've implemented parallel in one of our major scripts to perform data migrations between servers. Presently, the output is printed ungrouped (-u), as it arrives, in pretty colors, with periodic echoes of status from the function being executed depending on which sequence is being run (e.g. 5/20: $username: rsyncing homedir or 5/20: $username: restoring account). These are all echoed directly to the terminal running the script, and accumulate there. Depending on how long a command runs, however, output can end up well out of order, and long-running rsync commands can be lost in the shuffle. But I don't want to wait for long-running processes to finish in order to get the output of the processes that follow.

In short, my issue is keeping track of which arguments are being processed and are still running.

What I would like to do is send parallel into the background with (parallel args command {#} {} ::: $userlist) & and then track progress of each of the running functions. My initial thought was to use ps and grep liberally along with tput to rewrite the screen every few seconds. I usually run three jobs in parallel, so I want to have a screen that shows, for instance:

1/20: user1: syncing homedir
current file: /home/user1/www/cache/file12589015.php

12/20: user12: syncing homedir
current file: /home/user12/mail/joe/mailfile

5/20: user5: collecting information
current file: 

I can certainly get the above status output together no problem, but my current hangup is separating the output from the individual parallel processes into three different... pipes? variables? files? so that it can be parsed into the above information.


Solution

  • Not sure if this is much better:

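    echo hello im starting now
    sleep 1
    # one temporary directory holds a logfile per running job
    temp=$(mktemp -d)
    # start parallel and send the job to the background;
    # --rpl defines {log}, which names each logfile 'Working on ...' after the job's arguments;
    # `background` stands for the function that does the actual work
    parallel --rpl '{log} $_="Working on@arg"' -j3 background {} {#} ">$temp/{1log} 2>&1;rm $temp/{1log}" ::: foo bar baz foo bar baz one two three one two three :::+ 5 6 5 3 4 6 7 2 5 4 6 2 &
    # redraw the status once a second for as long as parallel is alive
    while kill -0 $!  2>/dev/null ; do
        cd "$temp"
        clear
        # last line of every logfile, with the file name as header;
        # errors are hidden for the moments when no logfile exists
        tail -vn1 * 2>/dev/null
        sleep 1
    done
    rm -rf "$temp"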

    It makes a logfile for each job, tails all logfiles every second, and removes a job's logfile when that job is done.

    The logfiles are named 'Working on ...'.
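
To get from the raw tail output to the layout asked for in the question, the loop could call a small formatter in place of `tail -vn1 *`. The following is only a minimal sketch under some assumptions: each job echoes its status as `N/M: user: action` (the format quoted in the question), and every other line in a logfile is a file name on a line of its own. The function name `show_status` and the demo logfile names are made up for illustration.

```shell
# Print one status block per logfile found in directory $1:
# the most recent "N/M: ..." status line, then the last line logged after it.
show_status() {
    local dir=$1 log
    for log in "$dir"/*; do
        # skip the unmatched glob and logs that are missing or still empty
        [ -s "$log" ] || continue
        awk '
            # a status line starts a new phase, so forget the previous file name
            /^[0-9]+\/[0-9]+: / { status = $0; current = ""; next }
            # any other line is assumed to be the file currently being handled
            { current = $0 }
            END { printf "%s\ncurrent file: %s\n\n", status, current }
        ' "$log" 2>/dev/null
    done
}

# Demo with two hand-written logfiles (hypothetical names and contents)
demo=$(mktemp -d)
printf '%s\n' '1/20: user1: syncing homedir' \
    '/home/user1/www/cache/file12589015.php' > "$demo/job1.log"
printf '%s\n' '5/20: user5: collecting information' > "$demo/job2.log"
show_status "$demo"
rm -rf "$demo"
```

Inside the polling loop this would be called as `show_status "$temp"`. Since finished jobs remove their own logfile, only the jobs that are still running get a block.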