Reverse DNS script with Regex in Python


I am working on a reverse DNS script that should open a log file, find the IP addresses in it, and resolve each one to a hostname. I have a regex that identifies the IP addresses, but since I added socket.gethostbyaddr the script ignores the regex and still lists entries in the file that are not IP addresses. I've never used Python before; this is what I have right now:

import socket
import re

f = open('ipidk.txt' , 'r')

lines = f.readlines()

raw_data = str(f.readlines())

regex = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})'

foundip = re.findall( regex, raw_data )

for raw_data in lines:
    host = raw_data.strip()

    try:
        dns = socket.gethostbyaddr(host)
        print("%s - %s" % (host, dns))
    except socket.error as exc:
        pass

        f.close()

Solution

  • You're calling f.readlines() twice. The first call reads the whole file and stores the result in lines. The second call has nothing left to read: it starts from the current file position, which is now the end of the file, and it does not rewind to the beginning. It therefore returns an empty list, so raw_data is just the string "[]", which contains no IPs.

    Instead, call f.read() once and assign the result to raw_data.

    Then loop over the IPs found by the regex, not over lines.

    import socket
    import re
    
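    # Read the whole file once; the with block closes it automatically.
    with open('ipidk.txt', 'r') as f:
        raw_data = f.read()

    # Four dot-separated groups of 1 to 3 digits (octet ranges are not checked).
    regex = r'(?:\d{1,3}\.){3}\d{1,3}'

    # Resolve only the strings the regex matched, not every line of the file.
    foundip = re.findall(regex, raw_data)
    for host in foundip:
        try:
            # gethostbyaddr returns a (hostname, aliaslist, ipaddrlist) tuple.
            dns = socket.gethostbyaddr(host)
            print("%s - %s" % (host, dns))
        except socket.error:
            # Skip addresses that cannot be resolved.
            pass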
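
To see the double-read problem in isolation, here is a small sketch that needs neither the log file nor a network connection. It uses io.StringIO as a stand-in for the open file. The sample log text and the helper names read_twice and find_ips are made up for this illustration and do not come from the original script.

```python
import io
import re

# Hypothetical sample log contents, invented for this illustration.
SAMPLE_LOG = (
    "GET /index.html from 8.8.8.8\n"
    "no address on this line\n"
    "login from 192.168.1.10 ok\n"
)

# Same pattern as in the answer above.
IP_REGEX = r'(?:\d{1,3}\.){3}\d{1,3}'


def read_twice(f):
    """Call readlines() twice on one file object, as the question's script does."""
    first = f.readlines()
    second = f.readlines()  # the file position is already at the end
    return first, second


def find_ips(text):
    """Return every substring of text that looks like a dotted-quad IPv4 address."""
    return re.findall(IP_REGEX, text)


# io.StringIO behaves like an open text file, so no file on disk is needed.
first, second = read_twice(io.StringIO(SAMPLE_LOG))

print(len(first))              # 3
print(str(second))             # []
print(find_ips(str(second)))   # []
print(find_ips(SAMPLE_LOG))    # ['8.8.8.8', '192.168.1.10']
```

The third print is what happened in the question: the regex was run against the string "[]", so foundip was empty, and the loop then went over lines instead, passing every line of the file to socket.gethostbyaddr.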