bash cut bc

Piping the output of a cut operation to bc


Is it at all possible to get bc to compute based on the output of a cut command?

Let's say I have the following column-based file:

PAK_01896       PAU_03392       75.8    149     32      1       1       145     1       149     *       *
PAK_02014       PAU_03392       69.8    149     45      0       1       149     1       149     *       *
PAU_02074       PAU_03392       77.2    149     30      1       1       145     1       149     *       *
PAU_02206       PAU_03392       69.1    149     46      0       1       149     1       149     *       *
PAU_02775       PAU_03392       79.2    149     31      0       1       149     1       149     *       *
PAK_02606       PAU_03392       78.5    149     32      0       1       149     1       149     *       *
PAU_01961       PAU_03392       67.1    149     49      0       1       149     1       149     *       *
PAK_03203       PAU_03392       95.3    149     7       0       1       149     1       149     *       *
PLT_01716       PAU_03392       76.5    149     35      0       1       149     1       149     *       *
PLT_01758       PAU_03392       79.2    149     31      0       1       149     1       149     *       *
PAU_03392       PAU_03392       100.0   149     0       0       1       149     1       149     *       *
PLT_01696       PAU_03392       78.5    149     32      0       1       149     1       149     *       *
PLT_02424       PAU_03392       78.5    149     32      0       1       149     1       149     *       *
PLT_01736       PAU_03392       77.2    149     34      0       1       149     1       149     *       *
PLT_02568       PAU_03392       67.1    149     49      0       1       149     1       149     *       *
PAK_01787       PAU_03392       66.4    149     50      0       1       149     1       149     *       *

I'd like to be able to perform some calculation on certain fields, for example summing and/or averaging the 3rd column. My first thought was to try this:

 cut -f3 column_based_file.txt | bc

But, perhaps unsurprisingly, this just prints each value in column 3 back unchanged.

I know there are workable solutions to this in threads such as this one that I could use, but since cut has been my go-to way of manipulating column-based data in bash for a while, I'm wondering whether it is possible at all. Maybe bc has a flag for reading in one line at a time and storing the values?

EDIT: There are some great solutions in the suggested threads and in the answers given. Out of curiosity, since that's how I'd originally thought to do it, does anyone have a cut- and bc-based solution (in case perl or awk weren't available for some reason)?


Solution

  • I would use awk; in my humble opinion it is better suited to this task. Say your data is stored in sumavg.csv; then this GNU awk script (sumavg.awk) prints the sum and average of the third field:

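    # Add the third field of every line to the running total s.
        {s += $3 }
    # After the last line, print the sum and the average.
    END {print "Sum:", s, " Avg: ", s / FNR}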
    

    Run it with the command awk -f sumavg.awk sumavg.csv.

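    $3 is the third field on each line. END is a special pattern whose action is executed once, after all input has been read. FNR is the number of records read from the current file, so inside the END block it equals the number of rows in the file.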