How to replace all the values in one file with their corresponding names from another file


I need help replacing the values (numbers) in one file (File-1.txt) with the corresponding names (strings) from another file (File-2.txt), which lists the values in its first column and the names in its second column. The two files are shown below:

File-1.txt

(0, ':', [])
(1, ':', [18])
(2, ':', [20, 41])
(3, ':', [18])
(4, ':', [17])
(5, ':', [])
(6, ':', [])

File-2.txt

1   ALA_A_87
2   THR_A_127
3   GLY_A_128
4   ILE_A_130
5   THR_A_166
6   THR_A_167

Expected output:

(ALA_A_87, ':', [THR_A_127])
(THR_A_127, ':', [ALA_A_87, GLY_A_128, ILE_A_130])
(GLY_A_128, ':', [THR_A_127, 5])
(ILE_A_130, ':', [THR_A_127])
(THR_A_166, ':', [GLY_A_128, THR_A_167])
(THR_A_167, ':', [THR_A_166])

I have been trying some awk code, but with no luck. The awk code is below:

awk -F, '
    BEGIN {
        subs="ALA_A_87 THR_A_127 GLY_A_128 ILE_A_130 THR_A_166 THR_A_167";
        split( subs, subs_arr );
    }
    NR == 1 { 
        print; 
        next 
    } 
    NR>1{
        for i in {1 2 3 4 5 6 }; {
            $i = subs_arr[ $i++ ];
        }
        print
    }
' File-1.txt

Here I hard-coded the names to be substituted in the script itself. Ideally they would come from an external file (File-2.txt): each number in File-1.txt should be matched against the first column of File-2.txt and replaced with the corresponding name from the second column. Please help me fix this code.

Thanks in advance


Solution

  • This awk one-liner should help (here f2 is File-2.txt, the number-to-name table, and f1 is File-1.txt):

    awk 'NR==FNR{a[$1]=$2;next}{FS="[(,]"}sub(/[^,]*/,"("a[$2])+1' f2 f1
    

    It outputs something like:

    (, ':', [])
    (ALA_A_87, ':', [18])
    (THR_A_127, ':', [20, 41])
    (GLY_A_128, ':', [18])
    (ILE_A_130, ':', [17])
    (THR_A_166, ':', [])
    (THR_A_167, ':', [])
    (LYS_A_189, ':', [])
    (GLY_A_190, ':', [45])
    (ALA_A_191, ':', [])
    (GLY_A_192, ':', [26])
    (MET_A_193, ':', [23])
    (LEU_A_194, ':', [])
    (THR_B_200, ':', [37])
    .....
    

    Note the first line: 0 does not exist in file2, so nothing is printed in its place inside the parentheses. If this is not what you want, please let me know what value should be there.
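
The output above shows that the one-liner only translates the leading number on each line; the numbers inside the brackets stay as they are. If those should be translated too, as the expected output in the question suggests, a sketch along the following lines might work. It rests on two assumptions that are not stated in the question: every run of digits in File-1.txt is a value to look up, and a number with no entry in File-2.txt (such as 0 or 18) is left unchanged rather than blanked. The function name replace_ids is made up for illustration.

```shell
# Sketch: replace every number in the data file with its name from the mapping file.
# Assumption: numbers missing from the mapping are kept as they are.
replace_ids() {
    # $1 = mapping file (number in column 1, name in column 2), $2 = data file
    awk '
        NR == FNR { name[$1] = $2; next }   # first file: build the number -> name table
        {
            out = ""
            rest = $0
            # Walk the line left to right, copying text and translating each run of digits.
            # Only the original text is scanned, so digits inside inserted names are not touched.
            while (match(rest, /[0-9]+/)) {
                num = substr(rest, RSTART, RLENGTH)
                out = out substr(rest, 1, RSTART - 1) ((num in name) ? name[num] : num)
                rest = substr(rest, RSTART + RLENGTH)
            }
            print out rest
        }
    ' "$1" "$2"
}
```

It would be called as `replace_ids File-2.txt File-1.txt`, with the mapping file first, in the same order as the one-liner.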