I have GNU Parallel installed via Homebrew on a Mac Studio:
(base) ~ % brew info parallel
==> parallel: stable 20221222 (bottled), HEAD
Shell command parallelization utility
https://savannah.gnu.org/projects/parallel/
(base) ~ % parallel --version
GNU parallel 20221222
Copyright (C) 2007-2022 Ole Tange, http://ole.tange.dk and Free Software
Foundation, Inc.
I tried --pipe-part with sed as follows:
parallel --eta -vv -a SRR8758324_2.fastq -k --block 30M --pipe-part 'sed "s/+.*/+/"' > SRR8758324_2.mod.fastq
When I monitored CPU usage with htop, I could see all 20 cores light up.
However, when I ran the following:
parallel -j 20 --eta -vv 'sed "s/+.*/+/"' ::: SRR8758324_2.fastq > SRR8758324_2.mod.fastq
only a single core was used. I'd really appreciate pointers to what I am missing.
OS: macOS 13.1 22C65 arm64
Host: Mac13,2
Kernel: 22.2.0
Uptime: 9 hours, 19 mins
Packages: 181 (brew)
Shell: zsh 5.9
Resolution: 3440x1440
DE: Aqua
WM: Quartz Compositor
WM Theme: Blue (Light)
Terminal: iTerm2
Terminal Font: Monaco 10
CPU: Apple M1 Ultra
GPU: Apple M1 Ultra
Memory: 3099MiB / 131072MiB
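You're just running one instance of sed on one file in the second example. With ::: each argument becomes one job, and you passed a single argument, so -j 20 has only one job to schedule.
In the first one, because of --pipe-part, you are asking GNU Parallel to split the file into chunks and process each one in a new job.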
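To make the difference concrete, here is a rough sketch of the "split into chunks, one job per chunk" idea using only split, sed and background jobs. It is not how GNU Parallel is implemented: --pipe-part cuts the file by block size, whereas this sketch cuts by line count, and the input records, the chunk size of 4 lines, and the names workdir, chunk and result are all made up for illustration.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)

# Hypothetical input: two tiny FASTQ-like records (4 lines each).
printf '@r1\nACGT\n+r1\nIIII\n@r2\nTTGA\n+r2\nIIII\n' > "$workdir/in.fastq"

# Cut the file into chunks of 4 lines (one record per chunk here).
split -l 4 "$workdir/in.fastq" "$workdir/chunk."

# Start one sed per chunk in the background so the chunks run concurrently.
for chunk in "$workdir"/chunk.*; do
    sed "s/+.*/+/" "$chunk" > "$chunk.out" &
done
wait  # block until every background sed has finished

# Concatenate the results in chunk order, similar in spirit to -k.
result=$(cat "$workdir"/chunk.*.out)
rm -rf "$workdir"

printf '%s\n' "$result"
```

With a single input file and no chunking there is only one sed process to run, which is why the ::: form keeps one core busy no matter what -j is set to.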