How do I get and format the contents of a bunch of text files (md5sums) in Linux?


I have a bunch of md5 files, which have a hash and a path/filename. I want to output the hash and filename, but not the path.

Example file contents:

d7b7494a0602d91b391848dfaebec7ac  /home/develop/md5sums/file1.md5

Desired output:

d7b7494a0602d91b391848dfaebec7ac file1.md5
dd036a1e1c16b3488309a75f80e5eb92 file2.md5
bf60fb26cbe1f4e9e4001aa485b89ff8 file3.md5

My current attempts from searches so far are:

cat *.md5 | xargs -0 sed -i 's/\/home\/develop\/md5sums\///g'

(Gives an error e.g. for file3.md5: sed: 2: "bf60fb26cbe1f4e9e4001aa ...": extra characters at the end of d command)

cat *.md5 | xargs -I '{}' sed -i 's/\/home\/develop\/md5sums\////g'

(Gives an error e.g. for file3.md5: sed: -I or -i may not be used with stdin)

I can probably figure out how to solve it with a for loop, but ideally I'd like to keep it as a piped one-liner if possible, and I think there should be a way for cat/xargs/sed to work, I just can't figure it out!

The path is hardcoded, so I don't feel the need to use basename, particularly as the md5 file contains more than just the path/file, which I think makes it more tricky!


Solution

  • You can use this:

    sed -E 's:([0-9a-f]+)[[:space:]]+/(.+/)*([^/]+)$:\1 \3:g' file.md5
    

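    Here we capture the hash, the directory part of the path, and the last segment of the path, then replace the whole line with the hash (\1) and the last segment (\3). sed accepts several input files, so you can pass *.md5 instead of file.md5 and skip cat and xargs entirely.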
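
To try the expression on several files at once, here is a minimal sketch. The function name `strip_md5_paths` and the temporary directory `demo_dir` are made up for illustration, and the sample files only mirror the ones from the question. It writes to stdout and leaves the original files untouched, since no `-i` is used.

```shell
#!/bin/sh
# Hypothetical helper: print "hash filename" for every line of the given md5 files.
strip_md5_paths() {
    # \1 is the hash, \3 is the last path segment;
    # the middle group swallows the leading directories.
    sed -E 's:([0-9a-f]+)[[:space:]]+/(.+/)*([^/]+)$:\1 \3:' "$@"
}

# Demo on throwaway files (contents copied from the question).
demo_dir=$(mktemp -d)
printf '%s  %s\n' d7b7494a0602d91b391848dfaebec7ac /home/develop/md5sums/file1.md5 > "$demo_dir/file1.md5"
printf '%s  %s\n' dd036a1e1c16b3488309a75f80e5eb92 /home/develop/md5sums/file2.md5 > "$demo_dir/file2.md5"

# The shell expands the glob, so sed receives every file as an argument.
strip_md5_paths "$demo_dir"/*.md5

rm -r "$demo_dir"
```

In your own directory the equivalent one-liner would be the same sed command with *.md5 as the file argument. The pattern is anchored with $, so the trailing g flag from the original command is not needed here.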