bash process sleep pid tail

Tail vs sleep to wait on process


Currently I do:

while [ -d "/proc/$PID"  ]; do
  sleep 1
done

This waits for a process to exit. If I replaced it with:

tail --pid=$PID -f /dev/null

Would that be more efficient for the CPU? Or does tail just use the same polling under the hood?


Solution

  • Assuming $PID runs for 10 seconds:

    • the while loop performs about 10 directory tests and starts the sleep binary about 10 times
    • the alternatives start a single tail process or call the wait builtin once

    So I'd expect the tail and wait approaches to use less CPU.
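The per-second cost of the polling loop can be made visible by counting its passes. This is only a minimal sketch and not part of the measurements: the 2-second background `sleep` and the `iterations` counter are made up for illustration, and it assumes a system that exposes `/proc` (Linux or Cygwin).

```bash
#!/bin/bash
# Hypothetical stand-in for the process being waited on.
sleep 2 &
pid=$!

iterations=0
# Each pass performs one directory test and starts one external sleep process.
while [ -d "/proc/$pid" ]; do
  sleep 1
  iterations=$((iterations + 1))
done

echo "polling loop ran $iterations time(s), one sleep process per pass"
```

The count grows with the lifetime of the watched process, which is where the extra CPU time of the loop comes from.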

    Short of reviewing the source code for tail, we can run some simple tests:

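    $ cat testme.1
    #!/bin/bash
    # poll /proc once per second until the background sleep exits
    sleep "$1" &
    pid=$!
    while [ -d "/proc/$pid" ]; do sleep 1; done
    
    $ cat testme.2
    #!/bin/bash
    # let tail block until the background sleep exits
    sleep "$1" &
    pid=$!
    tail --pid="$pid" -f /dev/null
    
    $ cat testme.3
    #!/bin/bash
    # use the shell's wait builtin on the background sleep
    sleep "$1" &
    wait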
    

    Timings for a 10 sec test:

    $ /usr/bin/time testme.1 10
    0.07user 0.21system 0:10.36elapsed 2%CPU (0avgtext+0avgdata 98392 maxresident)k
      ^^       ^^                      ^^^^^                    ^^^^^
    0inputs+0outputs (26228 major+0minor)pagefaults 0swaps
                      ^^^^^
    
    $ /usr/bin/time testme.2 10
    0.01user 0.03system 0:10.11elapsed 1%CPU (0avgtext+0avgdata 27060 maxresident)k
      ^^       ^^                      ^^^^^                    ^^^^^
    0inputs+0outputs (7207 major+0minor)pagefaults 0swaps
                      ^^^^
    
    $ /usr/bin/time testme.3 10
    0.01user 0.01system 0:10.05elapsed 0%CPU (0avgtext+0avgdata 18484 maxresident)k
      ^^       ^^                      ^^^^^                    ^^^^^
    0inputs+0outputs (4904 major+0minor)pagefaults 0swaps
                      ^^^^
    

    NOTES:

    • tests were run under Cygwin with bash 4.4.12, in a Windows 10 VM on an i7-1260P
    • testme.1: of 10 runs, 5 showed 2%CPU and 5 showed 3%CPU (call it 2.5% CPU)
    • testme.2: of 10 runs, 9 showed 0%CPU and 1 showed 1%CPU (call it 0.1% CPU)
    • testme.3: of 10 runs, 9 showed 0%CPU and 1 showed 1%CPU (call it 0.1% CPU)
    • testme.3 (wait) only works for child processes of the current shell, so it is not an option when waiting on a process started elsewhere
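The last note suggests picking the waiting method by whether the target is a child of the current shell. A rough sketch of that idea, assuming bash and GNU tail; the function names `is_my_child` and `wait_for_pid` are hypothetical and not taken from the tests above:

```bash
#!/bin/bash

# Hypothetical helper: succeeds if PID $1 is a background job of this shell.
is_my_child() {
  local p
  for p in $(jobs -p); do
    [ "$p" = "$1" ] && return 0
  done
  return 1
}

# Hypothetical helper: block until PID $1 has exited.
wait_for_pid() {
  local pid=$1
  if is_my_child "$pid"; then
    # Child of this shell: the wait builtin blocks until it exits
    # and returns the child's exit status.
    wait "$pid"
  else
    # Any other process: fall back to GNU tail's --pid option.
    tail --pid="$pid" -f /dev/null
  fi
}

# Example: wait on a child started by this shell.
sleep 1 &
child=$!
wait_for_pid "$child"
echo "child $child finished with status $?"
```

Note that `jobs -p` reports one PID per job, so for a background pipeline the PID in `$!` may not be listed; in that case the sketch simply takes the tail branch.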

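    The overhead of the while/sleep loop grows with run time, while tail and wait show little to no increase. Comparing the 10 sec runs above with the 60 sec runs below, user+system time for the loop rises from 0.28s to 1.50s, versus 0.04s to 0.10s for tail and 0.02s to 0.03s for wait: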

    $ /usr/bin/time testme.1 60
    0.34user 1.16system 1:01.02elapsed 2%CPU (0avgtext+0avgdata 488804 maxresident)k
      ^^     ^^^^                      ^^^^^                    ^^^^^^
    0inputs+0outputs (131123 major+0minor)pagefaults 0swaps
                      ^^^^^^
    
    $ /usr/bin/time testme.2 60
    0.03user 0.07system 1:00.34elapsed 0%CPU (0avgtext+0avgdata 26356 maxresident)k
      ^^       ^^                      ^^^^^                    ^^^^^
    0inputs+0outputs (7200 major+0minor)pagefaults 0swaps
                      ^^^^
    
    $ /usr/bin/time testme.3 60
    0.00user 0.03system 1:00.06elapsed 0%CPU (0avgtext+0avgdata 18328 maxresident)k
      ^^       ^^                      ^^^^^                    ^^^^^
    0inputs+0outputs (4917 major+0minor)pagefaults 0swaps
                      ^^^^
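On the original question of whether tail polls: the GNU tail help text describes `-s, --sleep-interval=N` as checking the process given with `--pid` at least once every N seconds (default 1.0). That suggests a periodic check rather than pure event notification, although the exact mechanism may differ between coreutils versions and platforms. A minimal sketch, assuming GNU tail, of shortening that interval so the wait returns sooner after the process exits; the 1-second `sleep` is just a stand-in:

```bash
#!/bin/bash
# Hypothetical stand-in for the process being waited on.
sleep 1 &
pid=$!

start=$(date +%s)
# -s 0.1: ask tail to check the process roughly every 0.1 seconds.
tail --pid="$pid" -s 0.1 -f /dev/null
end=$(date +%s)

echo "tail returned after about $((end - start)) second(s)"
```

A shorter interval trades a little more CPU for faster detection; with the default of one second, tail still runs as a single process for the whole wait, which fits the low numbers measured for testme.2.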