awk

Why is there a difference in behaviour between these two awk commands?


I have a tab-separated file with these contents; let's call it test.dat:

Hf    0.26748    0.24144    0.03301
Hf    0.26748    0.74144    0.46699
Hf    0.73252    0.24144    0.53301
Hf    0.73252    0.74144    0.96699
O     0.07135    0.37927    0.36787
O     0.07135    0.87927    0.13213
O     0.46012    0.49229    0.73395
O     0.46012    0.99229    0.76605
O     0.53988    0.99229    0.26605
O     0.53988    0.49229    0.23395
O     0.92865    0.37927    0.86787
O     0.92865    0.87927    0.63213

Now I want to subtract 0.25 from columns 2 and 3, so I do this:

awk '{print $1 " " ($2-0.25) " " ($3-0.25) " " $4}' test.dat

The output is:

Hf   0.01748   -0.00856   0.03301
Hf   0.01748   0.49144   0.46699
Hf   0.48252   -0.00856   0.53301
Hf   0.48252   0.49144   0.96699
O   -0.17865   0.12927   0.36787
O   -0.17865   0.62927   0.13213
O   0.21012   0.24229   0.73395
O   0.21012   0.74229   0.76605
O   0.28988   0.74229   0.26605
O   0.28988   0.24229   0.23395
O   0.67865   0.12927   0.86787
O   0.67865   0.62927   0.63213

But I wanted the new output to be tab-separated rather than space-separated, so I tried:

awk -v OFS='\t' -v FS='\t' '{print $1 " " ($2-0.25) " " ($3-0.25) " " $4}' test.dat

And the output is now weird:

Hf    0.26748    0.24144    0.03301-0.25-0.25
Hf    0.26748    0.74144    0.46699-0.25-0.25
Hf    0.73252    0.24144    0.53301-0.25-0.25
Hf    0.73252    0.74144    0.96699-0.25-0.25
O     0.07135    0.37927    0.36787-0.25-0.25
O     0.07135    0.87927    0.13213-0.25-0.25
O     0.46012    0.49229    0.73395-0.25-0.25
O     0.46012    0.99229    0.76605-0.25-0.25
O     0.53988    0.99229    0.26605-0.25-0.25
O     0.53988    0.49229    0.23395-0.25-0.25
O     0.92865    0.37927    0.86787-0.25-0.25
O     0.92865    0.87927    0.63213-0.25-0.25

What am I doing wrong?

Important Edit:

As @Renaud Pacalet correctly observed, I was confusing tab separation and alignment. I wanted an output with aligned columns! Hence the accepted answer below!

Edit:

Results of the suggestions below:

$ awk -v OFS='\t' -v FS='\t' '{print $1, ($2-0.25), ($3-0.25), $4}' hfo-test.dat
Hf    0.26748    0.24144    0.03301     -0.25   -0.25
Hf    0.26748    0.74144    0.46699     -0.25   -0.25
Hf    0.73252    0.24144    0.53301     -0.25   -0.25
Hf    0.73252    0.74144    0.96699     -0.25   -0.25
O     0.07135    0.37927    0.36787     -0.25   -0.25
O     0.07135    0.87927    0.13213     -0.25   -0.25
O     0.46012    0.49229    0.73395     -0.25   -0.25
O     0.46012    0.99229    0.76605     -0.25   -0.25
O     0.53988    0.99229    0.26605     -0.25   -0.25
O     0.53988    0.49229    0.23395     -0.25   -0.25
O     0.92865    0.37927    0.86787     -0.25   -0.25
O     0.92865    0.87927    0.63213     -0.25   -0.25

Another try:

$ awk -v OFS='\t' '{$2-=0.25; $3-=0.25; print}' hfo-test.dat 
Hf      0.01748 -0.00856        0.03301
Hf      0.01748 0.49144 0.46699
Hf      0.48252 -0.00856        0.53301
Hf      0.48252 0.49144 0.96699
O       -0.17865        0.12927 0.36787
O       -0.17865        0.62927 0.13213
O       0.21012 0.24229 0.73395
O       0.21012 0.74229 0.76605
O       0.28988 0.74229 0.26605
O       0.28988 0.24229 0.23395
O       0.67865 0.12927 0.86787
O       0.67865 0.62927 0.63213

Solution

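  • The commands that set -v FS='\t' behave differently from the ones that do not because your input file is not tab-separated. With -v FS='\t', awk finds no tab to split on, so the whole line is treated as the first (and only) field. $2, $3 and $4 are then empty, and an empty field counts as 0 in arithmetic, which is why -0.25 shows up instead of the differences you expected.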

    The output of your last try (the one that sets only -v OFS='\t') is tab-separated, but what you apparently want is not tab-separated output; it is output with aligned columns. Try:

    awk '{$2-=0.25; $3-=0.25; print}' test.dat | column -t
    

    or, if you prefer a right alignment:

    awk '{$2-=0.25; $3-=0.25; print}' test.dat | column -tR0
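
To make the field-splitting point concrete, here is a minimal sketch that feeds awk one line shaped like those in test.dat, assuming (as diagnosed above) that the columns are separated by spaces rather than tabs. The variable names and the printf widths are illustrative choices, not part of the original commands. The last command shows one possible way to get aligned columns from awk itself when column is not available.

```shell
#!/bin/sh
# One sample line, separated by runs of spaces (assumption: no tabs, as in test.dat).
line='Hf    0.26748    0.24144    0.03301'

# Default FS splits on runs of blanks, so awk sees four fields.
nf_default=$(printf '%s\n' "$line" | awk '{print NF}')

# With FS set to a tab there is nothing to split on: the whole line is field 1.
nf_tab=$(printf '%s\n' "$line" | awk -v FS='\t' '{print NF}')

# $2 is empty in that case and counts as 0 in arithmetic.
diff=$(printf '%s\n' "$line" | awk -v FS='\t' '{print $2 - 0.25}')

echo "default FS: NF=$nf_default"
echo "FS=tab:     NF=$nf_tab"
echo "\$2 - 0.25 with FS=tab: $diff"

# Alignment done inside awk with fixed-width printf fields (widths are illustrative).
aligned=$(printf '%s\n' "$line" |
  awk '{printf "%-2s %8.5f %8.5f %8.5f\n", $1, $2 - 0.25, $3 - 0.25, $4}')
echo "$aligned"
```

Checking NF this way is a quick test of whether the separator you passed to awk matches the one actually in the file. The printf variant also pads every number to the same number of decimals, which column -t does not do.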