
Storing the state of an awk query


Thanks very much for looking at my thread. I want to write a script that reads in a VERY LARGE list of domains, checks which ones resolve, and then stores only the ones that resolved in another file.

I currently have this in a script:

nslookup < input.txt - 1.1.1.1 -port=53 2>&1 |
awk '
NR==FNR { list[NR] = $0; next }
/^Name:/                { ++numResults; state="found" }
/Non-existent domain/   { ++numResults; state="not found" }
/NXDOMAIN/              { ++numResults; state="not found" }
/No answer/             { ++numResults; state="not found" }
state == "found"        { print list[numResults]; state="" }
' input.txt - >> output.txt

I also tried an extra line:

/[Cc]an.t find/         { ++numResults; state="not found" }

But somehow the input lines and the results aren't lining up. For example, adding this line hides total_garbage.com from the output. As far as I can tell, the nslookup result for total_garbage.com does not contain the words 'Can.t find', so I have no idea what's going on.

The problems are:

1. It is not handling the 'Can't find'/'No answer' case (00038a.net is still printed).

2. It is not handling the 'NXDOMAIN' case (total_garbage.com is still printed).

3. It is not handling the 'Name' case (0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info is missing from the output).

4. Lots of newlines are printed at the end (you can see the whitespace in my output).

Sample input to my script:

google.ca
comingsoon.brightside.com
00038a.net
0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
total_garbage.com

Desired output of my script:

google.ca
comingsoon.brightside.com
0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info

Actual output:

google.ca
comingsoon.brightside.com
00038a.net
total_garbage.com








Output of nslookup < input.txt:

Server:     127.0.0.1
Address:    127.0.0.1#53

Non-authoritative answer:
Name:   google.ca
Address: 216.58.192.131
Server:     127.0.0.1
Address:    127.0.0.1#53

Non-authoritative answer:
comingsoon.brightside.com   canonical name = elb-brightside-17469.aptible.in.
Name:   elb-brightside-17469.aptible.in
Address: 54.86.171.167
Name:   elb-brightside-17469.aptible.in
Address: 54.174.154.102
Server:     127.0.0.1
Address:    127.0.0.1#53

Non-authoritative answer:
*** Can't find 00038a.net: No answer
Server:     127.0.0.1
Address:    127.0.0.1#53

Non-authoritative answer:
Name:   0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Address: 178.162.203.226
Name:   0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Address: 178.162.203.211
Name:   0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Address: 178.162.203.202
Name:   0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Address: 85.17.31.122
Name:   0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Address: 85.17.31.82
Name:   0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Address: 5.79.71.225
Name:   0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Address: 5.79.71.205
Name:   0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
Address: 178.162.217.107
Server:     127.0.0.1
Address:    127.0.0.1#53

** server can't find total_garbage.com: NXDOMAIN

Solution

  • Is this what you're trying to do? For testing, I'm piping your provided sample through cat nslookup.out | rather than running nslookup ... | locally, since a live lookup on my machine would produce different output from the sample you want the awk script to parse.

    $ cat tst.sh
    #!/usr/bin/env bash
    
    #nslookup < input.txt 2>&1 |
    cat nslookup.out |
    awk '
    NR==FNR { list[NR] = $0; next }
    /^Name:/                { state="found" }
    /[Cc]an\047t find/      { state="not found" }
    !NF && (state != "") {
        ++numResults
        if ( state == "found" ) {
            print list[numResults]
        }
        state=""
    }
    ' input.txt -
    
    $ ./tst.sh
    google.ca
    comingsoon.brightside.com
    0-0-0-0-0-0-0-0-0-0-0-0-0-10-0-0-0-0-0-0-0-0-0-0-0-0-0.info
    

    Past attempts:

    $ cat gravity.list
    comingsoon.brightside.com
    total_garbage.com
    google.com
    
    $ cat tst.sh
    #!/usr/bin/env bash
    
    nslookup < gravity.list 2>&1 |
    awk '
    NR==FNR { list[NR] = $0; next }
    /^Name:/                { result = $NF }
    /Non-existent domain/   { result = "not found" }
    result != "" { print list[++numResults], "->", result; result="" }
    ' gravity.list -
    
    $ ./tst.sh
    comingsoon.brightside.com -> elb-brightside-17469.aptible.in
    total_garbage.com -> not found
    google.com -> google.com
    

    or this?

    $ cat tst.sh
    #!/usr/bin/env bash
    
    nslookup < gravity.list 2>&1 |
    awk '
    NR==FNR { list[NR] = $0; next }
    /^Name:/                { ++numResults; state="found" }
    /Non-existent domain/   { ++numResults; state="not found" }
    state == "found" { print list[numResults]; state="" }
    ' gravity.list -
    
    $ ./tst.sh
    comingsoon.brightside.com
    google.com
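
Why the script in the question drifts: it runs `++numResults` once per matching line, not once per query. The .info domain prints eight `Name:` lines, and `*** Can't find 00038a.net: No answer` matches both the `Can.t find` and the `No answer` patterns, so the counter runs ahead of the domain list. The first script in this answer avoids that by only recording the state on those lines and counting once, at the blank line that follows each `Server:`/`Address:` header. Below is a minimal offline sketch of the same idea, assuming the output layout shown in the question. The `.example` domains, the `192.0.2.x` addresses, the temp file names, the `finish_query` helper, and the `END` flush for a last query with no trailing blank line are all my own illustrative additions.

```shell
#!/usr/bin/env bash
# Offline sketch: replay captured nslookup-style output instead of querying DNS.
set -eu

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Hypothetical domain list, in the same order the queries were issued.
printf '%s\n' good.example alias.example empty.example bogus.example > "$tmp/input.txt"

# Hand-written stand-in for nslookup output, following the layout in the question.
cat > "$tmp/nslookup.out" <<'EOF'
Server:     127.0.0.1
Address:    127.0.0.1#53

Non-authoritative answer:
Name:   good.example
Address: 192.0.2.1
Name:   good.example
Address: 192.0.2.2
Server:     127.0.0.1
Address:    127.0.0.1#53

Non-authoritative answer:
alias.example   canonical name = target.example.
Name:   target.example
Address: 192.0.2.3
Server:     127.0.0.1
Address:    127.0.0.1#53

Non-authoritative answer:
*** Can't find empty.example: No answer
Server:     127.0.0.1
Address:    127.0.0.1#53

** server can't find bogus.example: NXDOMAIN
EOF

resolved=$(awk '
# First file: remember each domain by its position in the list.
NR==FNR { list[NR] = $0; next }

# Second file: only record what kind of answer the current query got.
/^Name:/            { state = "found" }
/[Cc]an\047t find/  { state = "not found" }

# A blank line closes the previous query, so count it exactly once here.
!NF && state != ""  { finish_query() }

# The last query may not be followed by a blank line, so flush it at the end.
END { if (state != "") finish_query() }

function finish_query() {
    ++numResults
    if (state == "found") print list[numResults]
    state = ""
}
' "$tmp/input.txt" "$tmp/nslookup.out")

printf '%s\n' "$resolved"
```

With this sample the script prints good.example and alias.example only. Multiple `Name:` lines and the CNAME case still count as one result each, because the counter only moves when a query is closed. Real nslookup output can differ between versions and resolvers, so check the patterns against what your own system prints before running this on the full list.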