
Divide a column into periods and print the min and max of each in awk


I have a data file which contains two columns. One of them varies periodically, and its max and min differ from one period to the next:

a     3
b     4
c     5
d     4
e     3
f     2
g     1
h     2
i     3
j     4
k     5
l     6
m     5
n     4
o     3
p     2
q     1
r     0
s     1
t     2
u     3

We can see that in the 1st period (from a to i): max = 5, min = 1. In the 2nd period (from i to u): max = 6, min = 0.

Using awk, I can only print the max and min of the whole second column, but I cannot print the min and max of each period. I would like to obtain results like this:

period   min   max
1        1     5
2        0     6

Here is what I did:

{
nb_lignes = 21
period = 9
nb_periodes = int(nb_lignes/period)
}

{
for (j = 0; j <= nb_periodes; j++)
   {   if (NR == (1 + period*j)) {{max=$2 ; min=$2}}
       for (i = (period*j); i <= (period*(j+1)); i++)
           {
               if (NR == i) 
                  { 
                     if ($2 >= max) {max = $2} 
                     if ($2 <= min) {min = $2} 
                     {print "Min: "min,"Max: "max,"Ligne: " NR}
                  }
           }
   }
}
#END { print "Min: "min,"Max: "max }

However, the result is far from what I am looking for:

Min: 3 Max: 3 Ligne: 1
Min: 3 Max: 4 Ligne: 2
Min: 3 Max: 5 Ligne: 3
Min: 3 Max: 5 Ligne: 4
Min: 3 Max: 5 Ligne: 5
Min: 2 Max: 5 Ligne: 6
Min: 1 Max: 5 Ligne: 7
Min: 1 Max: 5 Ligne: 8
Min: 1 Max: 5 Ligne: 9
Min: 1 Max: 5 Ligne: 9
Min: 4 Max: 4 Ligne: 10
Min: 4 Max: 5 Ligne: 11
Min: 4 Max: 6 Ligne: 12
Min: 4 Max: 6 Ligne: 13
Min: 4 Max: 6 Ligne: 14
Min: 3 Max: 6 Ligne: 15
Min: 2 Max: 6 Ligne: 16
Min: 1 Max: 6 Ligne: 17
Min: 0 Max: 6 Ligne: 18
Min: 0 Max: 6 Ligne: 18
Min: 1 Max: 1 Ligne: 19
Min: 1 Max: 2 Ligne: 20
Min: 1 Max: 3 Ligne: 21

Thank you in advance for your help.


Solution

  • Try something like:

    $ awk '
    BEGIN{print "period", "min", "max"}
    !f{min=$2; max=$2; ++f; next}
    {max = ($2>max)?$2:max; min = ($2<min)?$2:min; f++}
    f==9{print ++a, min, max; f=0}' file
    period min max
    1 1 5
    2 0 6
    
    • f acts as a counter of the lines seen in the current period. When f is zero (the start of a period), the second column is assigned to both min and max, f is incremented, and awk skips to the next line.
    • For every other line, the second column is compared with max and min. If it is bigger than max, it becomes the new max; likewise, if it is smaller than min, it becomes the new min. f is incremented again.
    • Once f reaches 9, the period number, min and max are printed, and f is reset to 0 so that the next line starts a new period. An incomplete period at the end of the file (here the last three lines, s to u) is never printed.
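
The same idea can also be written without a separate counter, by deriving the position inside the period from NR. The sketch below is only one possible variant, not the answer's own code: the function name period_minmax and the second argument (period length, defaulting to 9) are assumptions made for illustration. It should print the same table as above for the sample data, and like the one-liner it ignores an incomplete period at the end of the file.

```shell
# Hypothetical helper: min/max of column 2 for every block of `period` lines.
# $1 = data file, $2 = period length (assumed default: 9).
period_minmax() {
    awk -v period="${2:-9}" '
    BEGIN { print "period", "min", "max" }
    # First line of a block: start min and max from this value.
    (NR - 1) % period == 0 { min = max = $2 + 0 }
    {
        v = $2 + 0                 # force a numeric comparison
        if (v > max) max = v
        if (v < min) min = v
    }
    # Last line of a block: print the period number and its extremes.
    NR % period == 0 { print ++n, min, max }
    ' "$1"
}
```

It could be called as `period_minmax file 9`. Passing the period length with -v avoids hard-coding 9 in the program, which may help if other files have a different period.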