I am using GNU parallel, and it is running jobs in parallel, but far fewer at a time than the number of jobs I told it to run with -j.
I ran it this way:
cat untar_my_folders.jobfile | parallel -j 60
and the jobfile is very simple: it just has ~500 lines that look like this:
tar xvf myfolder.tar
tar xvf myfolder.tar
tar xvf myfolder.tar
tar xvf myfolder.tar
tar xvf myfolder.tar
I checked, and parallel does recognize all of the processors on the server:
$ parallel --number-of-cores
80
But when I use top, I can see that it is only running ~20 jobs at once.
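(I estimated that count with something like the following; the bracket keeps grep from matching its own process.)
ps aux | grep -c '[t]ar xvf'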
Thank you for any suggestions!
[edit] The version and OS info:
I see this all the time, and @Tinmarino points to the issue, but does not explain why.
top shows processes that take up a lot of CPU time. tar, however, does not: tar takes very little CPU time but a lot of disk I/O, and top does not show disk I/O.
iotop can help, as can iostat -dkx 1. I have a bash function:
IO() {
    string="${1:-sd}";
    # shift in the BEGIN block removes the pattern from @ARGV,
    # so perl's -n loop reads iostat's output from the pipe.
    iostat -dkx 1 | perl -ne 'BEGIN { $| = 1; $string = shift }
        # Keep the device name plus three of the stat columns; the three
        # alternatives match iostat outputs with different column counts.
        s/(........)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)/$1$3$9$21/ ||
        s/(........)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)/$1$4$5$16/ ||
        s/(........)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)(\s+\S+)/$1$6$7$14/;
        # Always print the header line; otherwise print only
        # devices matching the pattern.
        /Device/ and print and next;
        m^$string^ and print;
    ' $string
}
This way I can run IO to see all devices matching 'sd', or IO sda to show only how busy /dev/sda is.
My guess is that IO will show at least one disk maxing out, and that ps aux | grep "tar xvf" will show the correct number of jobs running, with most of them waiting for disk I/O.
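A quick way to check this (a sketch, assuming the job lines all start with tar xvf): count the jobs and look for processes in uninterruptible disk sleep, which ps reports as state D:

# Count the tar jobs that parallel actually started
ps aux | grep -c '[t]ar xvf'

# List processes blocked on I/O (state D) rather than using CPU
ps -eo state,pid,comm | awk '$1 == "D"'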
How to improve this: copy all your files to a RAM disk. In that case the CPUs will wait far less for disk I/O, as in this top snapshot:
top - 10:10:45 up 7 min, 3 users, load average: 72.80, 54.00, 27.52
Tasks: 1229 total, 82 running, 1147 sleeping, 0 stopped, 0 zombie
%Cpu(s): 4.1 us, 94.9 sy, 0.0 ni, 0.3 id, 0.0 wa, 0.0 hi, 0.8 si, 0.0 st
GiB Mem : 503.9 total, 1.6 free, 24.9 used, 477.4 buff/cache
GiB Swap: 200.0 total, 184.2 free, 15.8 used. 5.4 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
493 root 20 0 0 0 0 R 98.2 0.0 0:51.74 kswapd3
334367 tange 20 0 8152 3192 2996 R 92.7 0.0 0:13.70 tar
334717 tange 20 0 8152 3148 2956 R 92.4 0.0 0:12.28 tar
334143 tange 20 0 8152 3176 2984 R 92.0 0.0 0:13.11 tar
334114 tange 20 0 8152 3172 2976 R 90.5 0.0 0:13.41 tar
334285 tange 20 0 8152 1276 1144 R 89.9 0.0 0:13.72 tar
334701 tange 20 0 8152 3188 2996 R 89.0 0.0 0:13.50 tar
2316 root 20 0 6976856 840240 823248 S 88.7 0.2 5:45.89 containerd
334632 tange 20 0 8152 3168 2976 R 87.2 0.0 0:12.52 tar
334803 tange 20 0 8152 1244 1112 R 86.9 0.0 0:13.99 tar
334368 tange 20 0 8152 3168 2976 R 86.2 0.0 0:13.25 tar
334419 tange 20 0 8152 3172 2976 R 86.2 0.0 0:13.73 tar
334499 tange 20 0 8152 3152 2956 R 84.7 0.0 0:12.15 tar
334433 tange 20 0 8152 3152 2956 R 84.4 0.0 0:12.75 tar
334483 tange 20 0 8152 3152 2960 R 84.1 0.0 0:13.23 tar
334082 tange 20 0 8152 3152 2956 R 83.8 0.0 0:12.80 tar
334653 tange 20 0 8152 3188 2996 R 83.5 0.0 0:12.87 tar
334728 tange 20 0 8152 3156 2960 R 83.5 0.0 0:12.80 tar
334404 tange 20 0 8152 3164 2972 R 83.2 0.0 0:12.44 tar
334206 tange 20 0 8152 3184 2988 R 82.6 0.0 0:13.17 tar
334432 tange 20 0 8152 3128 2932 R 82.6 0.0 0:12.41 tar
334100 tange 20 0 8152 3160 2968 R 82.3 0.0 0:13.77 tar
334315 tange 20 0 8152 3160 2964 R 82.3 0.0 0:12.55 tar
334587 tange 20 0 8152 3148 2956 R 82.3 0.0 0:12.22 tar
334759 tange 20 0 8152 3148 2956 R 81.7 0.0 0:12.60 tar
334078 tange 20 0 8152 1240 1112 R 81.3 0.0 0:13.37 tar
334294 tange 20 0 8152 1244 1112 R 81.0 0.0 0:13.36 tar
334434 tange 20 0 8152 3148 2956 R 80.7 0.0 0:13.28 tar
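A minimal sketch of the RAM disk approach, assuming a tmpfs mount point of /mnt/ramdisk and that the tar files plus their extracted contents fit in RAM (the size here is just an example):

# Create a tmpfs-backed RAM disk
sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=100G tmpfs /mnt/ramdisk

# Stage the tar files there and extract in place
cp *.tar /mnt/ramdisk/
cd /mnt/ramdisk
cat untar_my_folders.jobfile | parallel -j 60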