Tags: bash, awk, metrics

AWK: get average value grouped by 2 fields


I found out how to get the summed value grouped by PID:

iotop -botqqqk -n 10 |  awk '{print $13,$2,$5}'| sort -rnk 1  | awk '                         
  { a[$2] += $3 }
  END {
    for (i in a) {
      printf "top_10_read{pid=\"%s\",name=\"%s\"} %s\n", i, $1, a[i] | "sort -rnk2";
    }
  }               
'

But I need to get the sorted average "DISK READ" value, grouped by PID and PROCESS NAME.

I found a ready-made solution: https://github.com/ncabatoff/process-exporter/. But I think it does not provide enough detail. I already have some "messy" scripts to export info about processes:

echo "TOP 10 CPU"
ps -A -rss -o comm,pcpu | awk -v cpus="$(nproc --all)" '
  { a[$1] += $2 }
  END {
    for (i in a) {
      printf "top_10_cpu{process=\"%s\"} %s\n", i, a[i]/cpus | "sort -rnk2";
    }
  }               
' | head -n 10

echo "TOP 10 RAM"
ps -A -rss -o comm,pmem | awk '                         
  { a[$1] += $2 }
  END {
    for (i in a) {
      printf "top_10_ram{process=\"%s\"} %s\n", i, a[i] | "sort -rnk2";
    }
  }               
' | head -n 10

echo "TOP 10 RSS"
ps -A -o comm,rss | awk '
  { a[$1] += $2 }
  END {
    for (i in a) {
      printf "top_10_rss{process=\"%s\"} %s\n", i, a[i]/1024 | "sort -g -rk2,2";
    }
  }
' | head -n 10

echo "TOP 10 VSZ"
ps -A -o comm,vsz | awk '
  { a[$1] += $2 }
  END {
    for (i in a) {
      printf "top_10_vsz{process=\"%s\"} %s\n", i, a[i]/1024 | "sort -g -rk2,2";
    }
  }
' | head -n 10

echo "TOP 10 SZ"
ps -A -o comm,sz | awk '
  { a[$1] += $2 }
  END {
    for (i in a) {
      printf "top_10_sz{process=\"%s\"} %s\n", i, a[i]/1024 | "sort -g -rk2,2";
    }
  }
' | head -n 10

I am also going to collect info about the TCP connection status of each process.

Is this a clever solution in your opinion, or am I just wasting my time when there is some ready-made option?

Sample of input:

 #iotop -botqqqk -n 10 |  awk '{print $13,$2,$5}'| sort -rnk 1
    glusterfsd 23976 0.00
    glusterfsd 23976 0.00
    glusterfsd 23975 122.89
    glusterfsd 23975 116.36

Sample of expected output:

    glusterfsd 23976 0.00
    glusterfsd 23975 119.625

Here "119.625" is the average DISK READ value for PID 23975.

Regards


Solution

  • 1st Solution: Could you please try the following.

    your_command | awk 'BEGIN{SUBSEP=OFS} {a[$1,$2]++;b[$1,$2]+=$NF} END{for(i in a){print i,b[i]/a[i]}}'
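
For anyone who finds the one-liner hard to read, here is a minimal sketch of the same idea written out with named arrays: one array counts the samples per (name, PID) group, the other sums the last field, and the END block divides the two. The function name avg_by_name_pid is only an illustration and not part of the answer; the input is assumed to be the three columns "name pid value" shown in the question.

```shell
# avg_by_name_pid: hypothetical helper name, used only for this sketch.
# Reads "name pid value" lines on stdin and prints "name pid average".
avg_by_name_pid() {
  awk '
    {
      key = $1 " " $2      # group key: process name plus PID
      count[key]++         # number of samples seen for this group
      sum[key] += $NF      # running total of the last field (DISK READ)
    }
    END {
      # for-in order is unspecified, so pipe through sort if order matters
      for (key in count)
        print key, sum[key] / count[key]
    }
  '
}

# Sample input taken from the question
printf '%s\n' \
  'glusterfsd 23976 0.00' \
  'glusterfsd 23976 0.00' \
  'glusterfsd 23975 122.89' \
  'glusterfsd 23975 116.36' | avg_by_name_pid
```

On the sample this should give 119.625 for PID 23975; the zero average for PID 23976 is printed as 0 rather than 0.00, because awk prints the number and not the original string.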
    


    2nd Solution: In case you want the output in the same order in which the 1st and 2nd fields first appear in the input, try the following.

    your_command | awk 'BEGIN{SUBSEP=" "} !c[$1,$2]++{d[++count]=$1 OFS $2} {a[$1,$2]++;b[$1,$2]+=$NF} END{for(i=1;i<=count;i++){print d[i],b[d[i]]/a[d[i]]}}' 
    


    EDIT: Seeing the OP's attempted code, here is an attempt to do this within a single awk. It is not tested at all, since sample output of the command iotop -botqqqk -n 10 was NOT provided.

    iotop -botqqqk -n 10 | awk 'BEGIN{SUBSEP=" "} !c[$13,$2]++{d[++count]=$13 OFS $2} {a[$13,$2]++;b[$13,$2]+=$5} END{for(i=1;i<=count;i++){print d[i],b[d[i]]/a[d[i]]}}'
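
If the goal is the sorted metric lines from the question rather than plain columns, the average can be combined with the question's own printf format. This is only a sketch under a few assumptions: the helper name top_read_metrics is made up for illustration, the input is the "name pid value" triple the question already extracts from iotop, and process names contain no spaces (otherwise both the grouping and the sort on the 2nd field would break).

```shell
# top_read_metrics: hypothetical helper, not from the original post.
# Expects "name pid value" on stdin and assumes names have no spaces.
top_read_metrics() {
  awk '
    {
      key = $1 SUBSEP $2           # group by name and PID together
      count[key]++
      sum[key] += $3
    }
    END {
      for (key in count) {
        split(key, part, SUBSEP)   # part[1] = name, part[2] = pid
        printf "top_10_read{pid=\"%s\",name=\"%s\"} %s\n", part[2], part[1], sum[key] / count[key]
      }
    }
  ' | sort -k2,2nr | head -n 10    # highest average first, keep the top 10
}
```

It could then be fed the same way as in the question, for example: iotop -botqqqk -n 10 | awk '{print $13,$2,$5}' | top_read_metrics. Sorting is done once outside awk instead of piping every printf into sort, which keeps the awk part easier to test on its own.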