I'm trying to solve a problem in awk as an exercise but I'm having trouble.
I want awk (or gawk) to be able to print all unique destination ports for a particular source IP address.
The source IP address is field 1 ($1) and the destination port is field 4 ($4).
Cut for brevity:
SourceIP SrcPort DstIP DstPort
192.168.1.195 59508 98.129.121.199 80
192.168.1.87 64802 192.168.1.2 53
10.1.1.1 41170 199.253.249.63 53
10.1.1.1 62281 204.14.233.9 443
I imagine you would store each source IP as an index into an array, but I'm not quite sure how you would store the destination ports as values. Maybe you could keep appending to a string held at that index, e.g. "80," then "80,443," and so on for each match. But maybe that's not the best solution.
I'm not too concerned about output, I really just want to see how one can approach this in awk. Though, for output I was thinking something like,
Source IP:dstport, dstport, dstport
192.168.1.195:80,443,8088,5900
I'm tinkering with something like this,
awk '{ if ( NR == 1) next; arr[$1,$4] = $4 } END { for (i in arr) print arr[i] }' infile
but I cannot figure out how to print the elements and their values for a two-dimensional array. Something along these lines seems like it would take care of the unique destination port task, because a repeated port for the same source IP just overwrites the same element.
Note: awk/gawk solution will get the answer!
Solution EDIT: slightly modified Kent's solution to print unique destination ports as mentioned in my question and to skip the column header line.
awk '{ if ( NR == 1 ) next ; if ( !seen[$1,$4]++ ) a[$1] = ( $1 in a ) ? a[$1]","$4 : $4 } END {for(x in a)print x":"a[x]}'
Here is one way with awk:
awk '{k=$1;a[k]=a[k]?a[k]","$4:$4}END{for(x in a)print x":"a[x]}' file
With your example, the output is:
kent$ awk '{k=$1;a[k]=a[k]?a[k]","$4:$4}END{for(x in a)print x":"a[x]}' file
192.168.1.195:80
192.168.1.87:53
10.1.1.1:53,443
(I omitted the title line)
EDIT
k=$1;a[k]=a[k]?a[k]","$4:$4
is exactly the same as:
if (a[$1]) # if a[$1] is not empty
a[$1] = a[$1]","$4 # concatenate $4 to it separated by ","
else # else if a[$1] is empty
a[$1] = $4 # let a[$1]=$4
I used k=$1 just to save some typing. The x=boolean?a:b form is the ternary (conditional) expression: x gets a if boolean is true and b otherwise.
I hope this explanation helps you understand the code.