awk sed cut

How to replace particular column values in a file with columns from another file


I know similar questions have been answered many times on SO (one example is here: https://stackoverflow.com/questions/7846476/replace-column-in-one-file-with-column-from-another-using-awk).

However, my case is different because I need to take care of a particular pattern.

The beginning of file1, which I want to update, is:

  3    6  0  6.0361821  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000
      0.994429353    0.000000000    0.000000000
      0.000000000    0.994429353    0.000000000
      0.000000000    0.000000000    2.469627493
           1  'A '    63548.626894188397     
           2  'B '    169717.29799472401     
           3  'C '    25598.367262405900     
    1    2      0.7458220147      0.7458220147      1.8031927376   << needs to be updated from here
    2    2      0.2486073382      0.2486073382      0.6664347554
    3    1      0.2486073382      0.2486073382      2.2628589536
    4    1      0.7458220147      0.7458220147      0.2067685394
    5    3      0.7458220147      0.7458220147      1.0275486366
    6    3      0.2486073382      0.2486073382      1.4420788564  << up to here
 T
     21.3496599      0.0000000      0.0000000
      0.0000000     21.3496599      0.0000000
      0.0000000      0.0000000     24.1101752
    1
     -7.6119990     -0.0000000      0.0000000
      0.0000000     -7.6119990      0.0000000
      0.0000000      0.0000000     -7.0331945
    2
     -7.6119990      0.0000000      0.0000000
     -0.0000000     -7.6119990      0.0000000
      0.0000000      0.0000000     -7.0331945
    3
      3.4711749      0.0000000      0.0000000

I need to update the third, fourth and fifth columns ($3, $4 and $5) of file1, from line "$ESPi" to line "$ESPf", with $1, $2 and $3 of file2 (shown below). The spacing in file1 must not change during the update. Here "$ESPi" and "$ESPf" are lines 8 and 13, respectively, and they change from case to case.

file2 is

0.750000000 0.750000000 0.730147661   << with these data 
0.250000000 0.250000000 0.269852339
0.250000000 0.250000000 0.916275414
0.750000000 0.750000000 0.083724586
0.750000000 0.750000000 0.416074343
0.250000000 0.250000000 0.583925657  << up to these data

I have tried to do this with:

#!/bin/bash
for j in `seq "$ESPi" 1 "$ESPf"`    # ESPi and ESPf are 8 and 13, respectively here and change case by case.
do
ESP1=$(cat file1 | head -n "$j" | tail -n 1 | awk '{print $3}')
ESP2=$(cat file1 | head -n "$j" | tail -n 1 | awk '{print $4}')
ESP3=$(cat file1 | head -n "$j" | tail -n 1 | awk '{print $5}')

for k in `seq 1 1 "$NELEMENTS"`  # $NELEMENTS is six here.
do
qeIN1=$(cat file2 | head -n "$k" | tail -n 1 | awk '{print $1}')
qeIN2=$(cat file2 | head -n "$k" | tail -n 1 | awk '{print $2}')
qeIN3=$(cat file2 | head -n "$k" | tail -n 1 | awk '{print $3}')
sed  's/'$ESP1'/'$qeIN1'/g' file1
sed  's/'$ESP2'/'$qeIN2'/g' file1
sed  's/'$ESP3'/'$qeIN3'/g' file1
done
done

This gives me

3    6  0  6.0361821  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000
      0.994429353    0.000000000    0.000000000
      0.000000000    0.994429353    0.000000000
      0.000000000    0.000000000    2.469627493
           1  'A '    63548.626894188397
           2  'B '    169717.29799472401
           3  'C '    25598.367262405900
    1    2      0.7458220147      0.7458220147      1.8031927376
    2    2      0.750000000      0.750000000      0.6664347554
    3    1      0.750000000      0.750000000      2.2628589536
    4    1      0.7458220147      0.7458220147      0.2067685394
    5    3      0.7458220147      0.7458220147      1.0275486366
    6    3      0.750000000      0.750000000      1.4420788564
 T
     21.3496599      0.0000000      0.0000000
      0.0000000     21.3496599      0.0000000
      0.0000000      0.0000000     24.1101752
    1
     -7.6119990     -0.0000000      0.0000000
      0.0000000     -7.6119990      0.0000000
      0.0000000      0.0000000     -7.0331945
    2
     -7.6119990      0.0000000      0.0000000
     -0.0000000     -7.6119990      0.0000000
      0.0000000      0.0000000     -7.0331945
    3
      3.4711749      0.0000000      0.0000000

The expected output is

  3    6  0  6.0361821  0.0000000  0.0000000  0.0000000  0.0000000  0.0000000
      0.994429353    0.000000000    0.000000000
      0.000000000    0.994429353    0.000000000
      0.000000000    0.000000000    2.469627493
           1  'A '    63548.626894188397
           2  'B '    169717.29799472401
           3  'C '    25598.367262405900
    1    2      0.750000000      0.750000000      0.730147661
    2    2      0.250000000      0.250000000      0.269852339
    3    1      0.250000000      0.250000000      0.916275414
    4    1      0.750000000      0.750000000      0.083724586
    5    3      0.750000000      0.750000000      0.416074343
    6    3      0.250000000      0.250000000      0.583925657
 T
     21.3496599      0.0000000      0.0000000
      0.0000000     21.3496599      0.0000000
      0.0000000      0.0000000     24.1101752
    1
     -7.6119990     -0.0000000      0.0000000
      0.0000000     -7.6119990      0.0000000
      0.0000000      0.0000000     -7.0331945
    2
     -7.6119990      0.0000000      0.0000000
     -0.0000000     -7.6119990      0.0000000
      0.0000000      0.0000000     -7.0331945
    3
      3.4711749      0.0000000      0.0000000

I am looking for a shell (bash) script.


Solution
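
Two things probably work against the attempt in the question: sed without -i only prints the edited text and leaves file1 as it is, and s/old/new/g replaces a value wherever it occurs rather than in one chosen row and column. The second point can be illustrated with a small sketch in Python. The two rows are copied from file1 in the question; the variable names are only for illustration.

```python
# Two sample rows copied from file1 in the question.
# They share the same values in the third and fourth columns.
rows = [
    "    1    2      0.7458220147      0.7458220147      1.8031927376",
    "    4    1      0.7458220147      0.7458220147      0.2067685394",
]
text = "\n".join(rows)

# Value-based substitution, similar in spirit to sed 's/old/new/g':
# every occurrence of the value changes, on every line.
by_value = text.replace("0.7458220147", "0.750000000")

# Compare each row before and after the substitution.
changed = [old != new for old, new in zip(rows, by_value.split("\n"))]
print(changed)  # both rows are modified, although only row 1 was targeted
```

Addressing the fields by line number and column position, as the script below does, avoids this ambiguity.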

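#!/bin/bash

ESPi=8     # first line of file1 to update (1-based)
ESPf=13    # last line of file1 to update (1-based)

# The heredoc delimiter is unquoted, so the shell substitutes ESPi and ESPf
# into the script before Python runs it. Requires python3 on PATH.
python3 > file1.new <<EOF
import sys, re
write = sys.stdout.write
espi = $ESPi
espf = $ESPf
# 0-based token index in file1 -> 0-based token index in file2
repls = {2: 0, 3: 1, 4: 2}
with open("file1") as f1, open("file2") as f2:
    # Copy the lines before the block unchanged.
    for i in range(espi - 1): write(next(f1))
    for i in range(espf - espi + 1):
        line = next(f1)
        toks = next(f2).split()
        for col, rcol in repls.items():
            # Keep the leading blanks and the first col tokens, swap the next token.
            pat = r"(\s*)((\S+\s+){{{col}}})(\S+)(.*)".format(col=col)
            repl = r"\g<1>\g<2>{val}\g<5>".format(val=toks[rcol])
            line = re.sub(pat, repl, line)
        write(line)
    # Copy the rest of the file unchanged.
    for line in f1: write(line)
EOF

mv file1.new file1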
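
To see the substitution step in isolation, here is a minimal sketch of the same idea as plain Python functions that work on lists of lines instead of files. The names replace_token and update_block are hypothetical helpers, not part of the script above, and the sample lines are a shortened excerpt of file1 and file2 from the question.

```python
import re

def replace_token(line, index, value):
    """Replace the whitespace-delimited token at 0-based index, keeping all spacing."""
    # Group 1: leading blanks plus the first `index` tokens with their separators.
    # Group 2: the token to replace. The rest of the line is never touched.
    pattern = r"^(\s*(?:\S+\s+){%d})(\S+)" % index
    return re.sub(pattern, lambda m: m.group(1) + value, line, count=1)

def update_block(lines1, lines2, first, last, mapping=None):
    """Update lines first..last (1-based, inclusive) of lines1 from lines2."""
    if mapping is None:
        # 0-based token index in lines1 -> 0-based token index in lines2
        mapping = {2: 0, 3: 1, 4: 2}
    out = list(lines1)
    for offset, lineno in enumerate(range(first, last + 1)):
        tokens = lines2[offset].split()
        for col, rcol in mapping.items():
            out[lineno - 1] = replace_token(out[lineno - 1], col, tokens[rcol])
    return out

# Shortened excerpt of the data in the question.
file1_lines = [
    "           3  'C '    25598.367262405900",
    "    1    2      0.7458220147      0.7458220147      1.8031927376",
    "    2    2      0.2486073382      0.2486073382      0.6664347554",
    " T",
]
file2_lines = [
    "0.750000000 0.750000000 0.730147661",
    "0.250000000 0.250000000 0.269852339",
]

result = update_block(file1_lines, file2_lines, first=2, last=3)
print("\n".join(result))
```

Only the tokens themselves are swapped, so the runs of blanks between columns stay exactly as they were. Because the new values are one character shorter than the old ones, later columns shift left by one position, which matches the expected output in the question.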