Is it at all possible to get bc to compute based on the output of a cut command?
Let's say I have the following column-based file:
PAK_01896 PAU_03392 75.8 149 32 1 1 145 1 149 * *
PAK_02014 PAU_03392 69.8 149 45 0 1 149 1 149 * *
PAU_02074 PAU_03392 77.2 149 30 1 1 145 1 149 * *
PAU_02206 PAU_03392 69.1 149 46 0 1 149 1 149 * *
PAU_02775 PAU_03392 79.2 149 31 0 1 149 1 149 * *
PAK_02606 PAU_03392 78.5 149 32 0 1 149 1 149 * *
PAU_01961 PAU_03392 67.1 149 49 0 1 149 1 149 * *
PAK_03203 PAU_03392 95.3 149 7 0 1 149 1 149 * *
PLT_01716 PAU_03392 76.5 149 35 0 1 149 1 149 * *
PLT_01758 PAU_03392 79.2 149 31 0 1 149 1 149 * *
PAU_03392 PAU_03392 100.0 149 0 0 1 149 1 149 * *
PLT_01696 PAU_03392 78.5 149 32 0 1 149 1 149 * *
PLT_02424 PAU_03392 78.5 149 32 0 1 149 1 149 * *
PLT_01736 PAU_03392 77.2 149 34 0 1 149 1 149 * *
PLT_02568 PAU_03392 67.1 149 49 0 1 149 1 149 * *
PAK_01787 PAU_03392 66.4 149 50 0 1 149 1 149 * *
I'd like to be able to perform some calculation on certain fields, for example something to the effect of summing and/or averaging the 3rd column. In my head I first thought to try this:
cut -f3 column_based_file.txt | bc
But perhaps unsurprisingly this just returns the value of each item in column 3.
I know there are workable solutions to this in threads such as this one that I could use, but since cut has been my go-to way of manipulating column-based data in bash for a while, I'm just wondering whether it is possible at all. Maybe bc has some flag for reading in one line at a time and storing the values, etc.
EDIT: There are some great solutions in the suggested threads and in the answers given. Out of curiosity, since that's how I'd originally thought to do it, does anyone have a cut and bc based solution (if, for some reason, perl or awk weren't available)?
I would use awk. It is, in my humble opinion, better suited for this task. Say your data is stored in sumavg.csv, then this GNU awk script (sumavg.awk) shows the sum and average of the third field:
{s += $3 }
END {print "Sum:", s, " Avg: ", s / FNR}
Run it with the command awk -f sumavg.awk sumavg.csv.
$3 is the third field on each line, END is a special pattern whose action is executed once all input has been read, and FNR gives the number of records (lines) read from the current file.
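If awk really is not available, the cut and bc route asked about in the edit can be sketched roughly as below. bc evaluates each input line as a separate expression, which is why piping cut straight into it just echoes the numbers back; joining the column into a single line with + signs (here with paste) gives bc one expression to evaluate. This is only a sketch under a few assumptions: the file is tab-delimited (the default for cut -f), sample.tsv is a made-up three-row excerpt of the data in the question, and the variable names are arbitrary.

```shell
# Hypothetical sample: first three rows of the question's data, tab-separated.
printf 'PAK_01896\tPAU_03392\t75.8\nPAK_02014\tPAU_03392\t69.8\nPAU_02074\tPAU_03392\t77.2\n' > sample.tsv

# Join column 3 into one line, "75.8+69.8+77.2", so bc sees a single expression.
sum=$(cut -f3 sample.tsv | paste -sd+ - | bc)

# Count the rows, then let bc do the division; scale sets the decimal places.
n=$(cut -f3 sample.tsv | wc -l)
avg=$(echo "scale=2; $sum / $n" | bc)

echo "Sum: $sum Avg: $avg"
```

Note that bc truncates rather than rounds at the chosen scale, so a larger scale may be preferable if the last digit matters.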