I am running nslookup on several URLs over multiple iterations from a shell script, and I need to count how many times each IP address was returned for each URL.
The output file stores the results as:
URL
IP address
Using uniq -c, I get the correct count when identical IP addresses are on adjacent lines, but not when they are on non-adjacent lines.
The command is:
cat file.log | awk '{print $1}' | uniq -c
Here is the sample output:
1 url
3 72.51.46.230
Now suppose multiple IP addresses are returned for a particular URL and identical addresses end up on non-adjacent lines because I ran several iterations. In that case uniq -c does not work. If I sort first, the counts are right, but I need to display the output as above for each URL, i.e. the URL followed by one line per IP address with its count.
For example, if I run nslookup on google.com, it returns multiple addresses, and uniq -c gives the following output. As you can see, some IP addresses repeat, but each count is only 1 because uniq -c does not merge non-adjacent lines.
1 74.125.236.64
1 74.125.236.78
1 74.125.236.67
1 74.125.236.72
1 74.125.236.65
1 74.125.236.73
1 74.125.236.70
1 74.125.236.66
1 74.125.236.68
1 74.125.236.71
1 74.125.236.69
1 nslookup: can't resolv 'google.com'
1 nslookup: can't resolv 'google.com'
1 nslookup: can't resolv 'google.com'
1 nslookup: can't resolv 'google.com'
1 nslookup: can't resolv 'google.com'
1 nslookup: can't resolv 'google.com'
1 nslookup: can't resolv 'google.com'
1 74.125.236.70
1 74.125.236.66
1 74.125.236.68
1 74.125.236.71
1 74.125.236.69
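The adjacency behaviour shown above can be reproduced with a small sketch. The three-line sample and the variable names are made up for illustration; the trailing awk only strips the padding that uniq -c puts in front of each count.

```shell
# Hypothetical sample: the same address appears twice, but not on adjacent lines.
sample='74.125.236.70
74.125.236.66
74.125.236.70'

# uniq -c only merges adjacent duplicates, so every count here is 1.
adjacent_only=$(printf '%s\n' "$sample" | uniq -c | awk '{print $1, $2}')

# Sorting first makes duplicates adjacent, at the cost of the original order.
sorted_counts=$(printf '%s\n' "$sample" | sort | uniq -c | awk '{print $1, $2}')

printf '%s\n' "$adjacent_only"
printf -- '--\n'
printf '%s\n' "$sorted_counts"
```

Sorting the whole file would also move the URL lines away from their addresses, which is why sort alone does not give the per-URL layout.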
I also tried awk, but the output is not formatted the way I need.
The awk command:
awk '{a[$0]++}END{for (i in a) printf "%-2d -> %s \n", a[i], i}' file.log
Can you suggest a better way to get the counts and display them in the format described above?
The desired output format is:
URL
Count IP address
Sample input file:
URL1
72.51.46.230
72.51.46.230
google.com
74.125.236.64
74.125.236.78
(null)
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
Required sample output:
URL1
2 72.51.46.230
google.com
1 74.125.236.64
1 74.125.236.78
1 null
5 nslookup: can't resolv 'google.com'
Thank you.
The following awk script does the job:
$1~/[a-z]+[.].*/{      # First field has letters followed by a dot, so treat the line as a URL
    for(i in ip)       # Print all the counts and IPs collected so far (nothing the first time)
        print ip[i],i
    delete ip          # Clear the array for the next URL's set of IPs
    print              # Print the URL
    next               # Skip to the next line
}
{
    ip[$0]++           # Any other line is an IP (or an error message): increment its count
}
END{                   # End of file: print the last set of IPs
    for(i in ip)
        print ip[i],i
}
Save it as script.awk and run it like this:
$ awk -f script.awk file
creativecommons.org
2 72.51.46.230
google.com
5 nslookup: can't resolv 'google.com'
1 (null)
1 74.125.236.64
1 74.125.236.78
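Note that for (i in ip) visits the array keys in an unspecified order, which is why the google.com lines above do not come out in input order. If first-seen order matters, one possible variation is sketched below. It is not part of the script above: the function names count_per_url and emit and the arrays count and order are hypothetical helpers, and the URL test is copied unchanged.

```shell
# Sketch: count lines per URL block, printing them in first-seen order.
count_per_url() {
    awk '
    function emit(   k) {                  # k is a local loop index
        for (k = 1; k <= n; k++)           # print in order of first appearance
            print count[order[k]], order[k]
        for (k = 1; k <= n; k++)           # clear the counts one key at a time
            delete count[order[k]]
        n = 0
    }
    $1 ~ /[a-z]+[.].*/ { emit(); print; next }   # same URL test as the script above
    !($0 in count)     { order[++n] = $0 }       # remember each new line once
                       { count[$0]++ }           # count every non-URL line
    END                { emit() }                # print the last block
    ' "$@"
}
```

It can be called as count_per_url file, or fed from a pipe. The trade-off is one extra array that records the order in which lines were first seen.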