Tags: bash, awk, multiple-columns, subtraction, negative-number

Cannot subtract negative numbers after declaring one a minimum using bash


The script I am editing takes data from nearly 100 files and pools it into a single file. I can pull and pool the data with no problem; my problem comes when it is time to process it.

I am trying to do two things. I would like to find the minimum value of the negative numbers in column 3 and then subtract that minimum value from each value in column 3 and print the results in a new column titled "rel". Currently, I am successfully finding the minimum value but I can't get the subtraction to work.

My input file (titled allRE3) looks like this:

file Gibbs kcal
RR0.out -1752.142111    -1099486.696073 
RR1.out -1752.141887    -1099486.555511 
RR4.out -1752.140564    -1099485.725315 
RR3.out -1752.140319    -1099485.571575 
RR5.out -1752.138532    -1099484.450215 
RR6.out -1752.138493    -1099484.425742 

Currently, the code I am using looks like this:

min=`awk 'BEGIN{a=0}{if ($3<0+a) a=$3} END{print a}' allRE3` 
awk 'NR == 1 { $5 = "rel" } NR >= 3 { $5 = $3 - $min } 1' < allRE3 >finalE

With that code I get a new file, finalE (which is desired), with the following contents:

file Gibbs kcal  rel
RR0.out -1752.142111 -1099486.696073    
RR1.out -1752.141887 -1099486.555511  -1.09949e+06
RR4.out -1752.140564 -1099485.725315  -1.09949e+06
RR3.out -1752.140319 -1099485.571575  -1.09949e+06    

What I want to get is below and I would like for it to be in a new file titled "finalE".

file Gibbs kcal  rel
RR0.out -1752.142111 -1099486.696073  0.00000
RR1.out -1752.141887 -1099486.555511  0.140562
RR4.out -1752.140564 -1099485.725315  0.970758
RR3.out -1752.140319 -1099485.571575  1.124498

Solution

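  • awk is not bash (or any other shell); it's a completely different tool with its own syntax, semantics, and variables. You can't set a shell variable to the output of one awk script and then refer to that shell variable by name inside another awk script. In your second command the script is in single quotes, so the shell never expands $min; awk sees min as one of its own (unset) variables, so $min means $0, the whole line, which evaluates to 0 in arithmetic. That is why $3 - $min is just $3 again, converted with awk's default %.6g number format, and why the NR >= 3 condition leaves the first data row (line 2) without a value. See http://cfajohnson.com/shell/cus-faq-2.html#Q24 for how to use the value of a shell variable in an awk script, but you don't need to do that as you should just be using one awk script for everything: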

    $ cat tst.awk
    NR==FNR {
        if ( NR > 1 ) {
            min = ( (NR==2) || ($3 < min) ? $3 : min )
        }
        next
    }
    { print $0, ( FNR==1 ? "rel" : sprintf("%0.6f",$3 - min) ) }
    
    $ awk -f tst.awk file file
    file Gibbs kcal rel
    RR0.out -1752.142111    -1099486.696073  0.000000
    RR1.out -1752.141887    -1099486.555511  0.140562
    RR4.out -1752.140564    -1099485.725315  0.970758
    RR3.out -1752.140319    -1099485.571575  1.124498
    RR5.out -1752.138532    -1099484.450215  2.245858
    RR6.out -1752.138493    -1099484.425742  2.270331
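
If you would rather keep the two-step approach from the question, the usual way to hand a shell value to awk is the -v option. The following is only a minimal sketch under some assumptions: it recreates the first four data rows of allRE3 with a here-document so that it runs on its own, it assumes the header is always line 1, and it reuses the names min and finalE from the question. Inside the awk script the value is referred to as min, not $min.

```shell
# Recreate a small sample of the question's input (assumption: first four data rows only).
cat > allRE3 <<'EOF'
file Gibbs kcal
RR0.out -1752.142111    -1099486.696073
RR1.out -1752.141887    -1099486.555511
RR4.out -1752.140564    -1099485.725315
RR3.out -1752.140319    -1099485.571575
EOF

# Pass 1: find the minimum of column 3, skipping the header line.
# min keeps the original field text, so printing it does not go through
# awk's default %.6g output format and no digits are lost.
min=$(awk 'NR == 2 || (NR > 2 && $3 + 0 < min + 0) { min = $3 } END { print min }' allRE3)

# Pass 2: pass the shell value into awk with -v and append the difference.
awk -v min="$min" '
    NR == 1 { print $0, "rel"; next }      # header: add the new column title
    { printf "%s %.6f\n", $0, $3 - min }   # data: value relative to the minimum
' allRE3 > finalE
```

The single-script version above is still simpler, because it avoids the extra shell variable and the risk of losing precision when a number is printed and read back in.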