AWK Threshold Greater Than

I have text files in a folder that look something like this:

[13]pkt_size=140
[31]pkt_size=139
[49]pkt_size=139
[67]pkt_size=140
[85]pkt_size=139
[103]pkt_size=139
[121]pkt_size=140
[139]pkt_size=139
[157]pkt_size=139
[175]pkt_size=140
[193]pkt_size=139
[211]pkt_size=139
[229]pkt_size=3660
[253]pkt_size=140
[271]pkt_size=139
[289]pkt_size=139
[307]pkt_size=5164
[331]pkt_size=140
[349]pkt_size=139
[367]pkt_size=139
[385]pkt_size=7512

I want to set threshold=1000, then have a script sum every 10 lines in the file and print the sum only if it is greater than the threshold.

I also want to run that script on every file in the folder, and it must create a separate output file for each input file.

Solution

This script sums every 10 lines and prints the sum if it is over 1000:

$ cat sum.awk 
BEGIN {
    FS = "="
}
{ acc += $2 }
(NR % 10) == 0 { if (acc > 1000) { print acc } acc = 0; }
$ awk -f sum.awk yourfile.txt 
1394
9938
$
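
Note that the script only prints on every tenth line, so a trailing group of fewer than 10 lines (the final 7512 line in the sample) is never reported. If that partial group should count too, one possible variant adds an END block. This is a sketch under the assumption that leftovers should be checked against the same threshold; the sample data and the variable names sample and partial are made up for illustration.

```shell
sample=$(mktemp)
# 12 hypothetical lines: ten small packets, then two large ones.
for i in 1 2 3 4 5 6 7 8 9 10; do echo "[$i]pkt_size=50"; done > "$sample"
printf '[11]pkt_size=900\n[12]pkt_size=800\n' >> "$sample"

partial=$(awk -F= '
    { acc += $2 }
    NR % 10 == 0 { if (acc > 1000) print acc; acc = 0 }
    # Assumption: leftover lines (fewer than 10) are also checked at end of input.
    END { if (NR % 10 != 0 && acc > 1000) print acc }
' "$sample")

echo "$partial"
rm -f "$sample"
```

Here the first ten lines sum to 500 and are skipped, while the two leftover lines sum to 1700 and are printed by the END block.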

If you want the 1000 threshold to be a parameter, there are several ways to pass parameters to awk. For instance, you can use -v var=val on the command line, as described here: https://www.gnu.org/software/gawk/manual/gawk.html#Options
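
As a minimal sketch of the -v approach, both the threshold and the group size can be supplied from the command line. The variable names threshold and n, the group size of 3, and the sample data are assumptions chosen to keep the example short; they are not part of the original script.

```shell
tmpdir=$(mktemp -d)
# Six hypothetical lines in the same format as the question's files.
# printf reuses the format string for each pair of arguments.
printf '[%d]pkt_size=%d\n' 1 140 2 139 3 3660 4 140 5 139 6 139 > "$tmpdir/sample.txt"

# threshold and n are illustrative names set with -v.
# Sum field 2 over every n lines; print the sum only when it exceeds threshold.
result=$(awk -F= -v threshold=1000 -v n=3 '
    { acc += $2 }
    NR % n == 0 { if (acc > threshold) print acc; acc = 0 }
' "$tmpdir/sample.txt")

echo "$result"
rm -rf "$tmpdir"
```

The first group (140 + 139 + 3660 = 3939) exceeds the threshold and is printed; the second (418) is not.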

As for running the command on every file and producing one output file per input, xargs comes to the rescue. Here is a sample session:

$ ls
sum.awk  yourfile.txt  zzzzzzz.txt
$ ls *.txt
yourfile.txt  zzzzzzz.txt
$ ls *.txt | xargs -L 1 -I {} /bin/bash -c 'awk -f sum.awk {} > {}.output'
$ ls
sum.awk  yourfile.txt  yourfile.txt.output  zzzzzzz.txt  zzzzzzz.txt.output
$

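xargs runs the command once for every line of its input. By default it would pack as many arguments as fit into a single execution; -L 1 limits each execution to one input line. Strictly speaking, -I already implies one line per execution, so -L 1 is redundant here (GNU xargs treats the two options as mutually exclusive and may print a warning).

Next, the -I {} argument declares a placeholder string {} that is replaced by each input line (the filename).

Finally, /bin/bash -c '<what to execute>' runs the awk script on that file and redirects the output. The shell is needed because the > redirection has to be evaluated once per file, which xargs cannot do on its own.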
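
If filenames may contain spaces or other shell-special characters, a plain shell loop is a possible alternative to the ls | xargs pipeline, since it avoids parsing ls output and substituting names into a bash -c string. The sketch below recreates the logic of sum.awk in a temporary directory so that it runs on its own; the directory, the two input files, and their contents are made up for illustration.

```shell
workdir=$(mktemp -d)

# Same logic as sum.awk above, written out so the example is self-contained.
cat > "$workdir/sum.awk" <<'EOF'
BEGIN { FS = "=" }
{ acc += $2 }
(NR % 10) == 0 { if (acc > 1000) { print acc } acc = 0; }
EOF

# Two hypothetical input files of 10 lines each, one with a space in its name.
for name in "a.txt" "b c.txt"; do
    for i in 1 2 3 4 5 6 7 8 9 10; do
        echo "[$i]pkt_size=150"
    done > "$workdir/$name"
done

# One awk run per file; quoting "$f" keeps unusual filenames intact.
for f in "$workdir"/*.txt; do
    awk -f "$workdir/sum.awk" "$f" > "$f.output"
done

ls "$workdir"
```

Each input file gets its own .output file next to it, just as in the xargs version.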

Hope it helps.