Using Linux bash command line, I need to merge two filles integrating several copies of the file1 inside the specified part of the file 2. The file 1 looks like:
ATOM 1 N SER A 1 -2.390 4.343 -17.003 1.00 27.76 N1+
ATOM 2 CA SER A 1 -2.066 5.647 -16.370 1.00 27.12 C
ATOM 3 C SER A 1 -2.394 5.608 -14.874 1.00 26.29 C
ATOM 4 O SER A 1 -3.014 4.627 -14.405 1.00 22.93 O
ATOM 5 CB SER A 1 -2.771 6.798 -17.057 1.00 28.10 C
ATOM 6 OG SER A 1 -2.538 8.023 -16.373 1.00 32.02 O
ATOM 7 N GLY A 2 -1.982 6.655 -14.162 1.00 25.31 N
ATOM 8 CA GLY A 2 -2.172 6.779 -12.716 1.00 24.93 C
ATOM 9 C GLY A 2 -0.888 6.336 -12.067 1.00 23.66 C
ATOM 10 O GLY A 2 -0.168 5.459 -12.608 1.00 27.42 O
ATOM 11 N PHE A 3 -0.636 6.866 -10.900 1.00 22.07 N
ATOM 12 CA PHE A 3 0.622 6.595 -10.191 1.00 21.70 C
ATOM 13 C PHE A 3 0.279 6.570 -8.716 1.00 20.39 C
ATOM 14 O PHE A 3 -0.265 7.544 -8.167 1.00 23.83 O
the file 2 is a multi-block, where separate parts are defined by model1,model 2, model N and separated by ENDMDL:
MODEL 1
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ENDMDL
MODEL 2
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ENDMDL
MODEL 3
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ENDMDL
I need to copy several times all the containt of the file 1 into the file 2 just before the separator ENDMDL
(in the second file), thus integrating several coppies of the file 1 into the file 2. Here is the example of expected output:
MODEL 1
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ATOM 1 N SER A 1 -2.390 4.343 -17.003 1.00 27.76 N1+
ATOM 2 CA SER A 1 -2.066 5.647 -16.370 1.00 27.12 C
ATOM 3 C SER A 1 -2.394 5.608 -14.874 1.00 26.29 C
ATOM 4 O SER A 1 -3.014 4.627 -14.405 1.00 22.93 O
ATOM 5 CB SER A 1 -2.771 6.798 -17.057 1.00 28.10 C
ATOM 6 OG SER A 1 -2.538 8.023 -16.373 1.00 32.02 O
ATOM 7 N GLY A 2 -1.982 6.655 -14.162 1.00 25.31 N
ATOM 8 CA GLY A 2 -2.172 6.779 -12.716 1.00 24.93 C
ATOM 9 C GLY A 2 -0.888 6.336 -12.067 1.00 23.66 C
ATOM 10 O GLY A 2 -0.168 5.459 -12.608 1.00 27.42 O
ATOM 11 N PHE A 3 -0.636 6.866 -10.900 1.00 22.07 N
ATOM 12 CA PHE A 3 0.622 6.595 -10.191 1.00 21.70 C
ATOM 13 C PHE A 3 0.279 6.570 -8.716 1.00 20.39 C
ATOM 14 O PHE A 3 -0.265 7.544 -8.167 1.00 23.83 O
ENDMDL
MODEL 2
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ATOM 1 N SER A 1 -2.390 4.343 -17.003 1.00 27.76 N1+
ATOM 2 CA SER A 1 -2.066 5.647 -16.370 1.00 27.12 C
ATOM 3 C SER A 1 -2.394 5.608 -14.874 1.00 26.29 C
ATOM 4 O SER A 1 -3.014 4.627 -14.405 1.00 22.93 O
ATOM 5 CB SER A 1 -2.771 6.798 -17.057 1.00 28.10 C
ATOM 6 OG SER A 1 -2.538 8.023 -16.373 1.00 32.02 O
ATOM 7 N GLY A 2 -1.982 6.655 -14.162 1.00 25.31 N
ATOM 8 CA GLY A 2 -2.172 6.779 -12.716 1.00 24.93 C
ATOM 9 C GLY A 2 -0.888 6.336 -12.067 1.00 23.66 C
ATOM 10 O GLY A 2 -0.168 5.459 -12.608 1.00 27.42 O
ATOM 11 N PHE A 3 -0.636 6.866 -10.900 1.00 22.07 N
ATOM 12 CA PHE A 3 0.622 6.595 -10.191 1.00 21.70 C
ATOM 13 C PHE A 3 0.279 6.570 -8.716 1.00 20.39 C
ATOM 14 O PHE A 3 -0.265 7.544 -8.167 1.00 23.83 O
ENDMDL
MODEL 3
REMARK VINA RESULT: -7.828 0.000 0.000
REMARK INTER + INTRA: -13.769
REMARK INTER: -10.110
REMARK INTRA: -3.659
REMARK UNBOUND: -3.196
ATOM 1 N SER A 1 -2.390 4.343 -17.003 1.00 27.76 N1+
ATOM 2 CA SER A 1 -2.066 5.647 -16.370 1.00 27.12 C
ATOM 3 C SER A 1 -2.394 5.608 -14.874 1.00 26.29 C
ATOM 4 O SER A 1 -3.014 4.627 -14.405 1.00 22.93 O
ATOM 5 CB SER A 1 -2.771 6.798 -17.057 1.00 28.10 C
ATOM 6 OG SER A 1 -2.538 8.023 -16.373 1.00 32.02 O
ATOM 7 N GLY A 2 -1.982 6.655 -14.162 1.00 25.31 N
ATOM 8 CA GLY A 2 -2.172 6.779 -12.716 1.00 24.93 C
ATOM 9 C GLY A 2 -0.888 6.336 -12.067 1.00 23.66 C
ATOM 10 O GLY A 2 -0.168 5.459 -12.608 1.00 27.42 O
ATOM 11 N PHE A 3 -0.636 6.866 -10.900 1.00 22.07 N
ATOM 12 CA PHE A 3 0.622 6.595 -10.191 1.00 21.70 C
ATOM 13 C PHE A 3 0.279 6.570 -8.716 1.00 20.39 C
ATOM 14 O PHE A 3 -0.265 7.544 -8.167 1.00 23.83 O
ENDMDL
I have tried to use cat BUT it just fused the both files together without the required replication of the first file:
cat file1.pdb file2.pdb > together.pdb
Need I pipe this to some expression of grep in order to replicate the file1 in the positions before the ENDMDL of the file 2 ?
Here is an awk solution that doesn't call unsafe system
or getline
:
awk 'NR==FNR {s = s $0 ORS; next} $0 == "ENDMDL" {$0 = s $0} 1' file1 file2
If you want to pass shell variable names then use:
awk 'NR==FNR {s = s $0 ORS; next}
$0 == "ENDMDL" {$0 = s $0} 1' "$file1" "$file2"