unix awk count cygwin

How to count the contents of one column by two others with awk?


Say I have a file with three columns, as follows:

00:00:01  Login     Steve
00:00:01  Install   Sarah
00:00:01  Install   Sarah
00:00:02  Explorer  Sarah
00:00:02  Explorer  Sarah
00:00:02  Install   Steve
00:00:02  Firewall  Sarah
00:00:02  Logout    Steve
00:00:04  Logout    Sarah

Is it possible to use awk to count how many times each user performs each action at each timestamp, so the output is something like this:

00:00:01 Steve Login 1
00:00:01 Sarah Install 2
00:00:02 Sarah Explorer 2
00:00:02 Steve Install 1
00:00:02 Sarah Firewall 1
00:00:02 Steve Logout 1
00:00:04 Sarah Logout 1

This is the closest I've come:

awk '{count[$1,$3,$2]++}END{for (i in count){split(i,a,SUBSEP); print a[1],a[2],count[i]}}' awktest.txt

Which gives me this result:

00:00:02 Sarah 1
00:00:02 Steve 1
00:00:02 Steve 1
00:00:01 Steve 1
00:00:04 Sarah 1
00:00:02 Sarah 2
00:00:01 Sarah 1
00:00:01 Sarah 1

I'm doing this in Cygwin.


Solution

  • $ awk -F"\t" -v OFS="\t" '{arr[$0]+=1} END {for(i in arr) print i,arr[i]}' test.in
    00:00:01        Install Sarah   2
    00:00:04        Logout  Sarah   1
    00:00:02        Firewall        Sarah   1
    00:00:01        Login   Steve   1
    00:00:02        Logout  Steve   1
    00:00:02        Install Steve   1
    00:00:02        Explorer        Sarah   2
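
The one-liner above splits on tabs, keeps the columns in input order (time, action, user), and prints the groups in whatever order awk's `for (i in arr)` happens to visit them. As a minimal sketch of an alternative, assuming the input is separated by spaces or tabs as in the question's sample, the version below keys on time, user, and action and records the order in which each group first appears, so the output follows the layout asked for in the question. The function name `count_actions` and the arrays `order` and `part` are illustrative names, not part of the original answer.

```shell
# Recreate the sample data from the question (file name taken from the question).
cat > awktest.txt <<'EOF'
00:00:01  Login     Steve
00:00:01  Install   Sarah
00:00:01  Install   Sarah
00:00:02  Explorer  Sarah
00:00:02  Explorer  Sarah
00:00:02  Install   Steve
00:00:02  Firewall  Sarah
00:00:02  Logout    Steve
00:00:04  Logout    Sarah
EOF

# Hypothetical helper: count rows per (time, user, action) in first-seen order.
count_actions() {
    awk '
    {
        key = $1 SUBSEP $3 SUBSEP $2              # time, user, action
        if (!(key in count)) order[++n] = key     # remember first appearance
        count[key]++
    }
    END {
        for (i = 1; i <= n; i++) {
            split(order[i], part, SUBSEP)         # recover the three fields
            print part[1], part[2], part[3], count[order[i]]
        }
    }' "$1"
}

count_actions awktest.txt
```

For the sample data this prints:

    00:00:01 Steve Login 1
    00:00:01 Sarah Install 2
    00:00:02 Sarah Explorer 2
    00:00:02 Steve Install 1
    00:00:02 Sarah Firewall 1
    00:00:02 Steve Logout 1
    00:00:04 Sarah Logout 1

The attempt in the question already built the right key; it printed only `a[1]` and `a[2]`, so the action stored in `a[3]` never reached the output.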