How to get count of unique IP addresses and errors in shell scripting using uniq or awk?


I am running nslookup on several URLs over multiple iterations from a shell script. I need to count how many times each IP address was returned for each URL.

In the output file, the results are stored as

URL 
IP address

Using the uniq -c command, I get the correct count when identical IP addresses are on adjacent lines, but not when they are on non-adjacent lines.

The command is
cat file.log | awk '{print $1}' | uniq -c

Here is the sample output:

1 url
3 72.51.46.230

Now suppose multiple IP addresses are returned for a particular URL and they end up on non-adjacent lines because I ran a number of iterations. In that case uniq -c does not work. If I sort the input first, the lines are reordered, but I need to keep the output grouped as above for each URL, i.e. the URL followed by lines with the count and the IP address.

For example, if I run nslookup on google.com it returns multiple addresses, and uniq -c gives the following output. As you can see, some IP addresses repeat, but each count is only 1 because uniq -c does not merge non-adjacent lines.

  1 74.125.236.64
  1 74.125.236.78
  1 74.125.236.67
  1 74.125.236.72
  1 74.125.236.65
  1 74.125.236.73
  1 74.125.236.70
  1 74.125.236.66
  1 74.125.236.68
  1 74.125.236.71
  1 74.125.236.69
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 74.125.236.70
  1 74.125.236.66
  1 74.125.236.68
  1 74.125.236.71
  1 74.125.236.69
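
The adjacency behaviour described above can be reproduced with a few lines of input. The sketch below uses a made-up three-line sample (not taken from the real log) and the hypothetical variable names adjacent_counts and sorted_counts. It shows that uniq -c alone leaves every count at 1, while sort | uniq -c counts correctly but reorders the lines, which is why neither fits the per-URL layout wanted here.

```shell
# Hypothetical sample: the same address appears on non-adjacent lines.
sample='74.125.236.70
74.125.236.66
74.125.236.70'

# uniq -c only merges adjacent duplicates, so every count stays at 1.
# The trailing awk just normalises the leading padding that uniq -c adds.
adjacent_counts=$(printf '%s\n' "$sample" | uniq -c | awk '{print $1, $2}')

# Sorting first makes duplicates adjacent, at the cost of the original order.
sorted_counts=$(printf '%s\n' "$sample" | sort | uniq -c | awk '{print $1, $2}')

printf '%s\n---\n%s\n' "$adjacent_counts" "$sorted_counts"
```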

I also tried awk, but its output is not formatted the way I need.

The awk command:

awk '{a[$0]++}END{for (i in a) printf "%-2d -> %s \n", a[i], i}' file.log

Can you suggest a better way to get the counts and display them in the format described above?

The desired output format is

URL
Count IP address

Sample input file:

URL1
72.51.46.230
72.51.46.230
google.com
74.125.236.64
74.125.236.78
(null)
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'

Required sample output:

URL1
2 72.51.46.230
google.com
1 74.125.236.64
1 74.125.236.78
1 null
5 nslookup: can't resolv 'google.com'

Thank you.


Solution

  • The following awk script does the job:

    $1~/[a-z]+[.].*/{         # Letters followed by a dot: treat the line as a URL
        for(i in ip)          # Print all the counts and IPs (nothing the first time)
            print ip[i],i
        delete ip             # Clear the array for the next set of IPs
        print                 # Print the URL
        next                  # Skip to the next line
    }
    {
        ip[$0]++              # Any other line is an IP or error: increment its count
    }
    END{                      # End of file: print the last set of IPs
        for(i in ip)
            print ip[i],i
    }


    Save it as script.awk and run it like this:

    $ awk -f script.awk file
    creativecommons.org
    2 72.51.46.230
    google.com
    5 nslookup: can't resolv 'google.com'
    1 (null)
    1 74.125.236.64
    1 74.125.236.78
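
Two details of the script above are worth noting: for (i in ip) visits entries in an unspecified order, and the URL test requires a dot, so a header such as URL1 from the question's sample would be counted as data. The following is a sketch of one possible variant, not the answerer's method. It assumes that result lines are always IPv4 addresses, "(null)", or lines starting with "nslookup:", and that every other non-blank line is a URL header. The names count_per_url and emit are made up for illustration.

```shell
# Hypothetical helper: per-URL counts, printed in first-seen order.
count_per_url() {
    awk '
    # Print the counts collected for the current URL, then reset the state.
    function emit(   k) {
        for (k = 1; k <= n; k++)
            print count[order[k]], order[k]
        split("", count)    # portable way to empty an array
        split("", order)
        n = 0
    }
    # Assumption: result lines are IPv4 addresses, "(null)", or nslookup errors.
    /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ || /^\(null\)$/ || /^nslookup:/ {
        if (!($0 in count))
            order[++n] = $0     # remember the order of first appearance
        count[$0]++
        next
    }
    # Ignore blank lines.
    /^[ \t]*$/ { next }
    # Anything else is taken to be a URL header.
    { emit(); print }
    END { emit() }
    ' "$@"
}
```

It can be run as count_per_url file.log, or with the log piped in on standard input. Keeping a separate order array costs a little memory but makes the output stable across awk implementations.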