I am attempting to rename the files titled {out1.hmm, out2.hmm, ... , outn.hmm} to unique identifiers taken from the third line of each file, e.g. {PF12574.hmm, PF09847.hmm, PF0024.hmm}. The script works on a single file; however, in the loop the variable does not get overwritten, and only one file remains after running the command below:
for f in *.hmm;
do output="$(sed -n '3p' < $f |
awk -F ' ' '{print $2}' |
cut -f1 -d '.' | cat)" |
mv $f "${output}".hmm; done;
The first line takes all the outn.hmm files as input. The second line sets a variable to the desired unique identifier; sed, awk, and cut are used to extract it. The variable is supposed to rename the current file to that identifier; however, the variable remains locked, and each file overwrites the previous one.
out1.hmm out2.hmm out3.hmm becomes PF12574.hmm
How can I overwrite the variable to get the following file structure:
out1.hmm out2.hmm out3.hmm becomes PF12574.hmm PF09847.hmm PF0024.hmm
You're piping the empty output of the assignment statement (to the variable named "output") into the mv command. Because the assignment is part of a pipeline, it runs in its own subshell, so that variable is never set in the shell that runs mv. What I think will happen is that you will, one after the other, rename all the files that match *.hmm to the file named ".hmm". Try ls -a to see if that's what actually happened.
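To make the point about the pipeline concrete, here is a minimal sketch, assuming bash with default settings. The values "initial" and "changed" are placeholders chosen for illustration and are not from the question.

```shell
#!/bin/bash
# Placeholder value so we can tell whether the later assignment "sticks".
output="initial"

# The assignment is one stage of a pipeline, so bash runs it in a subshell.
# Any change to the variable is lost when that subshell exits.
output="changed" | cat

# The parent shell still sees the old value.
echo "$output"   # prints "initial"
```

The same thing happens in the loop from the question: the assignment and the mv are separate pipeline stages, so mv sees an empty variable and every file is moved to ".hmm".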
The sed, awk, cut, and (unneeded) cat are a bit much. awk can do all you need. Then do the mv as a separate command:
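for f in *.hmm
do
    # Second field of the file's third line
    output=$(awk 'NR == 3 {print $2}' "$f")
    # Strip everything from the last dot onward, then append .hmm
    mv "$f" "${output%.*}.hmm"
done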
Note that the above does not do any checking to verify that output is assigned a reasonable value: one that is non-empty, that is a proper "identifier", etc.
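One way to add that checking is sketched below. This is only an illustration under stated assumptions: the function name rename_by_third_line is hypothetical, the rule for a "proper identifier" (letters, digits, underscore, hyphen) is a guess at what is reasonable here, and the files are assumed to be in the current directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): rename each given
# file after the second field of its third line, with basic sanity checks.
rename_by_third_line() {
    for f in "$@"; do
        # Skip anything that is not a regular file (e.g. an unmatched glob).
        [ -f "$f" ] || continue

        # Second field of line 3; stop reading once it has been printed.
        output=$(awk 'NR == 3 {print $2; exit}' "$f")
        # Drop the part after the last dot, if there is one.
        id=${output%.*}

        # Assumed rule: an identifier is non-empty and contains only
        # letters, digits, underscores, or hyphens.
        case $id in
            ''|*[!A-Za-z0-9_-]*)
                echo "skipping $f: unusable identifier '$id'" >&2
                continue ;;
        esac

        target="$id.hmm"
        # Nothing to do if the file already has its final name.
        [ "$f" = "$target" ] && continue
        # Refuse to overwrite an existing file.
        if [ -e "$target" ]; then
            echo "skipping $f: $target already exists" >&2
            continue
        fi

        mv -- "$f" "$target"
    done
}
```

It would be called as rename_by_third_line *.hmm from the directory holding the files. Files whose third line yields no usable identifier, or whose target name is already taken, are left untouched and reported on standard error.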