
Modify filenames with patterns from another file


I have these filenames:

demux.16S_For_bc1005--16S_Rev_bc1033.bam 
demux.16S_For_bc1005--16S_Rev_bc1044.bam 
demux.16S_For_bc1005--16S_Rev_bc1045.bam
demux.16S_For_bc1005--16S_Rev_bc1054.bam

I have this CSV file:

16S_For_bc1005;16S_Rev_bc1033;Pa_32_S2_Rp
16S_For_bc1005;16S_Rev_bc1035;Pa_29_S2_Rp
16S_For_bc1005;16S_Rev_bc1044;Pa_15_S2_Rp
16S_For_bc1005;16S_Rev_bc1045;Pa_13_S2_Rp
16S_For_bc1005;16S_Rev_bc1054;Pa_25_S2_Rp
16S_For_bc1005;16S_Rev_bc1056;Pa_12_S2_Rp

I need to insert the label from the 3rd column of the CSV file into the corresponding filename, just before the .bam extension, to get:

demux.16S_For_bc1005--16S_Rev_bc1033.Pa_32_S2_Rp.bam 
demux.16S_For_bc1005--16S_Rev_bc1044.Pa_15_S2_Rp.bam 
demux.16S_For_bc1005--16S_Rev_bc1045.Pa_13_S2_Rp.bam
demux.16S_For_bc1005--16S_Rev_bc1054.Pa_25_S2_Rp.bam

The trick is matching each label to the right filename: the match is based on the 1st and 2nd columns of the CSV. Some lines of the CSV file may not match any filename (16S_For_bc1005;16S_Rev_bc1035;Pa_29_S2_Rp, for example, because there is no file named demux.16S_For_bc1005--16S_Rev_bc1035.bam).

I first tried working from this topic: https://unix.stackexchange.com/questions/229858/renaming-files-using-list

Best


Solution

  • Using an awk that can print NUL chars, e.g. GNU awk:

    $ cat tst.sh
    #!/usr/bin/env bash
    
    while IFS=$'\n' read -d '' -r old new; do
        echo mv -- "$old" "$new"
    done < <(
        printf '%s\n' demux.* |
        awk '
            BEGIN { FS=";" }
            NR==FNR {
                old = "demux." $1 "--" $2 ".bam"
                new = "demux." $1 "--" $2 "." $3 ".bam"
                map[old] = new
                next
            }
            $0 in map {
                printf "%s\n%s\0", $0, map[$0]
            }
        ' 'foo.csv' -
    )
    

    $ ./tst.sh
    mv -- demux.16S_For_bc1005--16S_Rev_bc1033.bam demux.16S_For_bc1005--16S_Rev_bc1033.Pa_32_S2_Rp.bam
    mv -- demux.16S_For_bc1005--16S_Rev_bc1044.bam demux.16S_For_bc1005--16S_Rev_bc1044.Pa_15_S2_Rp.bam
    mv -- demux.16S_For_bc1005--16S_Rev_bc1045.bam demux.16S_For_bc1005--16S_Rev_bc1045.Pa_13_S2_Rp.bam
    mv -- demux.16S_For_bc1005--16S_Rev_bc1054.bam demux.16S_For_bc1005--16S_Rev_bc1054.Pa_25_S2_Rp.bam
    

    Remove the echo when done testing. With any other awk, print the old and new file names on separate lines, change the loop to while IFS= read -r file; do, and run the mv on every second line read. That variant assumes no file name contains a newline.
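
The portable variant described above could look roughly like the sketch below. It is not the answerer's code: the function name rename_from_csv is hypothetical, it reads two lines per loop iteration instead of counting every second line, and it pipes into the loop rather than using process substitution so it also runs under a plain POSIX sh. It assumes no file name contains a newline.

```shell
#!/usr/bin/env bash
# Sketch only: rename_from_csv is a hypothetical helper, not from the answer.
# $1 is the path to the semicolon-separated CSV (foo.csv in the answer).
rename_from_csv() {
    # List candidate files one per line, then let awk pair each file that
    # has a CSV entry with its new name, printed on two consecutive lines.
    printf '%s\n' demux.* |
    awk '
        BEGIN { FS=";" }
        NR==FNR {
            # First input (the CSV): map old file name -> new file name.
            map["demux." $1 "--" $2 ".bam"] = "demux." $1 "--" $2 "." $3 ".bam"
            next
        }
        # Second input (the file list): keep only names present in the CSV.
        $0 in map { print $0; print map[$0] }
    ' "$1" - |
    # Consume the output two lines at a time: old name, then new name.
    while IFS= read -r old && IFS= read -r new; do
        echo mv -- "$old" "$new"    # remove the echo when done testing
    done
}
```

Calling rename_from_csv foo.csv in the directory holding the demux.* files prints the mv commands it would run; CSV lines with no matching file are simply skipped.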