Tags: variables, unix, rename, naming, file-rename

Batch renaming files with text from file as a variable


I am attempting to rename files titled {out1.hmm, out2.hmm, ... , outn.hmm} to unique identifiers taken from the third line of each file {PF12574.hmm, PF09847.hmm, PF0024.hmm}. The script works on a single file; however, in the loop the variable does not get overwritten, and only one file remains after running the command below:

for f in *.hmm; do output="$(sed -n '3p' < $f | awk -F ' ' '{print $2}' | cut -f1 -d '.' | cat)" | mv $f "${output}".hmm; done;

The for clause takes all the outn.hmm files as input. The assignment sets a variable to the desired unique identifier, which sed, awk, and cut extract from the file. The variable is supposed to rename the current file to that identifier; however, the variable remains locked, so each rename overwrites the previous file.

out1.hmm out2.hmm out3.hmm becomes PF12574.hmm

How can I overwrite the variable to get the following file structure:

out1.hmm out2.hmm out3.hmm becomes PF12574.hmm PF09847.hmm PF0024.hmm


Solution

  • You're piping the (empty) output of the assignment to the variable named "output" into the mv command. Each command in a pipeline runs in its own subshell, so that variable is never set in the process that runs mv. What I think will happen is that every file matching *.hmm is renamed, one after the other, to the file named ".hmm", each one overwriting the last.

    Try ls -a to see if that's what actually happened.

    The sed, awk, cut, and (unneeded) cat are a bit much. awk can do all you need. Then do the mv as a separate command:

    for f in *.hmm
    do
      output=$(awk 'NR == 3 {print $2}' "$f")
      mv "$f" "${output%.*}.hmm"
    done
    

    Note that the above does not do any checking to verify that output is assigned to a reasonable value: one that is non-empty, that is a proper "identifier", etc.
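
The checks mentioned above could be sketched as follows. This is only an illustration, not part of the original answer: the function name rename_by_id is hypothetical, and it assumes that a "reasonable" identifier is non-empty and made up only of letters, digits, and underscores. It also refuses to overwrite an existing file, so a duplicate or missing identifier cannot silently destroy data.

```shell
# rename_by_id is a hypothetical helper name used for illustration only.
rename_by_id() {
  local f id target
  for f in "$@"; do
    [ -f "$f" ] || continue                     # skip an unmatched glob or a non-file
    id=$(awk 'NR == 3 {print $2; exit}' "$f")   # second field of the third line
    id=${id%.*}                                 # drop the part after the last ".", if any
    # Assumed rule: a valid identifier is non-empty and purely alphanumeric/underscore.
    case $id in
      ''|*[!A-Za-z0-9_]*)
        echo "skipping $f: unexpected identifier '$id'" >&2
        continue ;;
    esac
    target="$(dirname -- "$f")/$id.hmm"
    if [ -e "$target" ]; then                   # never clobber an existing file
      echo "skipping $f: $target already exists" >&2
      continue
    fi
    mv -- "$f" "$target"
  done
}
```

It would be called as rename_by_id *.hmm. Files that are skipped keep their original names, and the reason is printed to standard error.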