I have text files in a folder that look something like this:
[13]pkt_size=140
[31]pkt_size=139
[49]pkt_size=139
[67]pkt_size=140
[85]pkt_size=139
[103]pkt_size=139
[121]pkt_size=140
[139]pkt_size=139
[157]pkt_size=139
[175]pkt_size=140
[193]pkt_size=139
[211]pkt_size=139
[229]pkt_size=3660
[253]pkt_size=140
[271]pkt_size=139
[289]pkt_size=139
[307]pkt_size=5164
[331]pkt_size=140
[349]pkt_size=139
[367]pkt_size=139
[385]pkt_size=7512
I want to set threshold=1000, have a script sum every 10 lines of the file, and print the sum if it is greater than the threshold.
I also want to run that script over the whole folder, and it must create a separate output file for each input file.
This script sums every 10 lines and prints the sum if it is over 1000:
$ cat sum.awk
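BEGIN {
    FS = "="    # split each line at "=", so $2 is the packet size
}
{ acc += $2 }   # accumulate the size from every line
# Every 10th line: print the sum if it exceeds 1000, then reset it.
(NR % 10) == 0 { if (acc > 1000) { print acc } acc = 0 }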
$ awk -f sum.awk yourfile.txt
1394
9938
$
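Note that the sample has 21 lines, so the last line (7512) never completes a group of 10 and is not reported. If the trailing partial group should be checked too, an END rule is one way to do it. The following is only a sketch: the scratch directory, the file name sample.txt and its 12 generated lines are made up for the demonstration.

```shell
# Hypothetical scratch directory and input file, for demonstration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 12 lines of 600 each: one full group of 10 plus 2 leftover lines.
printf '[%d]pkt_size=600\n' 1 2 3 4 5 6 7 8 9 10 11 12 > sample.txt

result=$(awk -F= '
    { acc += $2 }
    # Full group of 10 lines: report it if over the limit, then reset.
    NR % 10 == 0 { if (acc > 1000) print acc; acc = 0 }
    # Leftover lines that did not fill a group are checked once at the end.
    END { if (NR % 10 != 0 && acc > 1000) print acc }
' sample.txt)

printf '%s\n' "$result"   # prints 6000, then 1200
```

Without the END rule the 1200 from the last two lines would be dropped silently, which is what the original script does.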
If you want the 1000 threshold to be a parameter, there are several ways to pass parameters to awk. For instance, you can use the -v var=val option on the command line, as described here: https://www.gnu.org/software/gawk/manual/gawk.html#Options
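As a minimal sketch of the -v approach, the snippet below recreates the sample data from the question and passes the limit in from the command line. The variable name threshold and the scratch directory are my own choices, not something awk prescribes.

```shell
# Hypothetical scratch directory, for demonstration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The 21 sample lines from the question.
cat > yourfile.txt <<'EOF'
[13]pkt_size=140
[31]pkt_size=139
[49]pkt_size=139
[67]pkt_size=140
[85]pkt_size=139
[103]pkt_size=139
[121]pkt_size=140
[139]pkt_size=139
[157]pkt_size=139
[175]pkt_size=140
[193]pkt_size=139
[211]pkt_size=139
[229]pkt_size=3660
[253]pkt_size=140
[271]pkt_size=139
[289]pkt_size=139
[307]pkt_size=5164
[331]pkt_size=140
[349]pkt_size=139
[367]pkt_size=139
[385]pkt_size=7512
EOF

# Same logic as sum.awk, but the limit comes from -v instead of being hard-coded.
# "threshold" is an assumed variable name.
result=$(awk -F= -v threshold=1000 '
    { acc += $2 }
    NR % 10 == 0 { if (acc > threshold) print acc; acc = 0 }
' yourfile.txt)

printf '%s\n' "$result"   # prints 1394, then 9938
```

Running the same command with -v threshold=2000 would print only 9938, since the first group sums to 1394.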
As for running the command on every file and producing an output file for each one, this is where xargs comes to the rescue. See this sample:
$ ls
sum.awk yourfile.txt zzzzzzz.txt
$ ls *.txt
yourfile.txt zzzzzzz.txt
$ ls *.txt | xargs -L 1 -I {} /bin/bash -c 'awk -f sum.awk {} > {}.output'
$ ls
sum.awk yourfile.txt yourfile.txt.output zzzzzzz.txt zzzzzzz.txt.output
$
xargs runs the command once for every line of its input. By default it tries to group several lines into each execution, but we prevent that with the -L 1 setting (strictly speaking, -I already implies one line per execution, so -L 1 is redundant here).
Next we use the -I {} argument to declare a placeholder string {} that is replaced by each input line (the filename).
Finally, /bin/bash -c '<what to execute>' runs the awk script on our file and redirects the output. A shell is needed because xargs cannot perform the > redirection by itself.
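If some filenames may contain spaces, a plain shell loop is a possible alternative to piping ls into xargs, since quoting "$f" keeps each name intact. This is a sketch under assumptions: the scratch directory and the two generated input files are hypothetical and exist only to make the example self-contained.

```shell
# Hypothetical scratch directory, for demonstration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The same script as sum.awk above.
cat > sum.awk <<'EOF'
BEGIN { FS = "=" }
{ acc += $2 }
(NR % 10) == 0 { if (acc > 1000) { print acc } acc = 0 }
EOF

# Two made-up inputs of 10 lines each; one name contains a space on purpose.
printf '[%d]pkt_size=140\n' 1 2 3 4 5 6 7 8 9 10 > "first file.txt"
printf '[%d]pkt_size=50\n'  1 2 3 4 5 6 7 8 9 10 > second.txt

# One awk run per input file, each writing to its own <name>.output file.
for f in *.txt; do
    [ -e "$f" ] || continue          # skip the literal pattern if nothing matches
    awk -f sum.awk "$f" > "$f.output"
done

cat "first file.txt.output"          # prints 1400
```

As with the xargs version, a file whose sums never exceed the threshold still gets an output file, just an empty one (second.txt.output here).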
Hope it helps.