bash gnu-parallel md5sum

Check an md5 file in parallel


I have an md5sum checksum file containing many lines, and I want to use GNU parallel to speed up the verification. When md5sum -c is given no file argument, it reads the checksum lines from stdin. I tried this:

cat checksums.md5 | parallel md5sum -c {}

But getting this error:

md5sum 445350b414a8031d9dd6b1e68a6f2367 testing.gz: No such file or directory

How can I parallelize the md5sum check?


Solution

  • Assuming checksums.md5 has the format:

    d41d8cd98f00b204e9800998ecf8427e  My file name
    

    Run:

    cat checksums.md5 | parallel --pipe -N1 md5sum -c
    

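    If your files are small, use -N100 instead of -N1, so that each md5sum job checks 100 lines and the cost of starting a job is spread over more files.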

    If that does not speed up processing, check whether your disks are fast enough: md5sum can process about 500 MB/s. iostat -dkx 1 shows whether the disks are the bottleneck.
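
The attempt in the question fails because parallel substitutes each input line for {} as a single argument, so md5sum -c looks for a checksum file whose name is the whole line. With --pipe, the line is sent to md5sum's stdin instead. The script below is a minimal sketch of that difference. The file names and the temporary directory are made-up sample data, and the final step runs only if GNU parallel is installed.

```shell
#!/usr/bin/env bash
# Sketch: argument form vs. stdin form of `md5sum -c`.
# All file names here are hypothetical sample data.
set -u
export LC_ALL=C   # fixed locale so md5sum's messages are predictable

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Sample files, one with spaces in its name.
printf 'alpha\n' > a.txt
printf 'beta\n'  > 'My file name'
md5sum a.txt 'My file name' > checksums.md5

line=$(head -n 1 checksums.md5)

# Argument form (what `parallel md5sum -c {}` runs): md5sum treats the
# whole line as the name of a checksum file, which does not exist.
if md5sum -c "$line" 2>/dev/null; then
    arg_status=ok
else
    arg_status=failed
fi

# Stdin form (what each `parallel --pipe -N1 md5sum -c` job receives).
stdin_result=$(printf '%s\n' "$line" | md5sum -c)

echo "argument form: $arg_status"
echo "stdin form:    $stdin_result"

# Full parallel check; -k keeps the output in input order.
if command -v parallel >/dev/null 2>&1; then
    parallel --pipe -N1 -k md5sum -c < checksums.md5
fi
```

Because each job reads complete lines from stdin, file names containing spaces need no extra quoting.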