I have a large file full of lines like this...
19:54:05 10.10.8.5 [SERVER] Response sent: www.example.com. type A by 192.168.4.5
19:55:10 10.10.8.5 [SERVER] Response sent: ns1.example.com. type A by 192.168.4.5
19:55:23 10.10.8.5 [SERVER] Response sent: ns1.example.com. type A by 192.168.4.5
I don't care about any of the other data, only what comes after "Response sent:". I'd like a list of the domain names sorted by how often each one occurs. The problem is that I won't know all the domain names in advance, so I can't just search for a fixed string.
Using the example above I'd like the output to be along the lines of
ns1.example.com (2)
www.example.com (1)
...where the number in ( ) is the count for that domain name.
How/what could I use to do this on Windows? The input file is .txt - the output file can be anything. Ideally a command-line process, but I'm really lost so I'd be happy with anything.
The cat is kind of out of the bag, so let's try to help a little. This is a PowerShell solution. If you have trouble following how it works, I encourage you to research the individual parts.
If your text file were "D:\temp\test.txt", you could do something like this:
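# Pull out the text between "sent: " and " type" on every matching line
$results = Select-String -Path D:\temp\test.txt -Pattern "(?<=sent: ).+(?= type)" | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
# Count identical names and list the most frequent first
$results | Group-Object | Select-Object Name,Count | Sort-Object Count -Descending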
Using your input, you would get this output:
Name             Count
----             -----
ns1.example.com.     2
www.example.com.     1
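If you want the exact "name (count)" layout from the question, something along these lines might work. This is only a sketch: the inline sample lines stand in for the real file, the trailing dot is trimmed on the assumption that you don't want it, and the output path in the last comment is a made-up example.

```powershell
# Sample lines stand in for the real log; use Get-Content D:\temp\test.txt for the actual file.
$lines = @(
    '19:54:05 10.10.8.5 [SERVER] Response sent: www.example.com. type A by 192.168.4.5'
    '19:55:10 10.10.8.5 [SERVER] Response sent: ns1.example.com. type A by 192.168.4.5'
    '19:55:23 10.10.8.5 [SERVER] Response sent: ns1.example.com. type A by 192.168.4.5'
)

$report = $lines |
    Select-String -Pattern '(?<=sent: ).+(?= type)' |          # keep only lines that match
    ForEach-Object { $_.Matches[0].Value.TrimEnd('.') } |      # extract the name, drop the trailing dot
    Group-Object |                                             # one group per distinct name
    Sort-Object Count -Descending |                            # most frequent first
    ForEach-Object { '{0} ({1})' -f $_.Name, $_.Count }        # format as "name (count)"

$report
# To save the result: $report | Set-Content D:\temp\counts.txt   (hypothetical output path)
```

Lines that don't contain "sent: ... type" produce no match and are simply skipped, so the rest of the log doesn't need to be filtered out first.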
Since there is a regex involved, I have saved a link that explains how it works.
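In short, the pattern can be read piece by piece. Here is a small sketch that runs it against one of your sample lines using the -match operator; the variable names are just for illustration.

```powershell
$line = '19:54:05 10.10.8.5 [SERVER] Response sent: www.example.com. type A by 192.168.4.5'

# (?<=sent: )  lookbehind: the match must be preceded by "sent: ", which is not included in the result
# .+           one or more characters (greedy)
# (?= type)    lookahead: the match must be followed by " type", which is not included either
if ($line -match '(?<=sent: ).+(?= type)') {
    $domain = $Matches[0]   # $Matches is filled in automatically by -match
}

$domain
```

Because .+ is greedy, it would run to the last " type" on a line if that text appeared more than once. Your sample lines only contain it once, but if that ever changes, the lazy form .+? should stop at the first one.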
Please keep in mind that SO is, of course, a site that helps programmers and programming enthusiasts. We are devoting our free time, whereas some people get paid to do this.