python, multithreading, network-programming, scapy, sniffing

Sending packets and sniffing in the same Python code


I need to set up a connection with each website from a list, send some packets, and sniff packets for just that website until I move on to the next website (iteration). When I go to the next iteration (website), I want to sniff and filter for that address only. Can I achieve this within a single Python script?

import socket
from scapy.all import sniff

# execution gets stuck in this sniff() call
sniff(filter="ip and host " + ip_addr, prn=print_summary)

req = "GET / HTTP/1.1\r\nHost: "+ website +"\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/58.0.3029.110 Chrome/58.0.3029.110 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\nAccept-Language: en-US,en;q=0.8\r\n\r\n"
url = (website, 80)
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto=socket.IPPROTO_TCP)
c.settimeout(5.0)
c.connect(url)
c.setsockopt(socket.SOL_IP, socket.IP_TTL, i)  # i is the loop variable
c.send(req)
print str(c.recv(4096))
c.close()

I am running the above code in a loop, but on the first iteration it gets stuck in the sniff() function. Can anyone help me with this?
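For reference, sniff() runs until a stop condition is met, and the call above supplies none, so it never returns and the loop never advances. A minimal sketch of the two built-in bounds, timeout and count (ip_addr below is a placeholder):

from scapy.all import sniff

ip_addr = "93.184.216.34"  # placeholder address

# either bound makes sniff() return instead of blocking forever
pkts = sniff(filter="ip and host " + ip_addr, timeout=10)  # stop after 10 seconds
pkts = sniff(filter="ip and host " + ip_addr, count=20)    # or after 20 packets
print "captured %d packets" % len(pkts)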


Solution

  • OK I've edited the answer.

    Sniffing packets for a single website isn't easy, as the Berkeley Packet Filter syntax used by scapy doesn't have a simple option for HTTP. See this question for some suggestions on the options available.
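
    In practice the closest you can get with BPF alone is to match on the TCP port plus the site's resolved address, since BPF only sees layer 3/4 fields. A small sketch (www.example.com is a hypothetical host):

    import socket

    website = "www.example.com"              # hypothetical host
    ip_addr = socket.gethostbyname(website)  # BPF matches addresses, not hostnames

    # "tcp port 80 and host" is the usual stand-in for "HTTP traffic for this site"
    bpf_filter = "tcp port 80 and host " + ip_addr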

    One possibility is to sniff the TCP packets to/from your web proxy server; I have done this in the code sample below, which saves the TCP packets for a list of different URLs to individually named files. I haven't put in any logic to detect when the page load finishes; I just used a 60 second timeout (see the stop_filter sketch after the code for one alternative). If you want something different then you can use this as a starting point. If you don't have a proxy server to sniff then you'll need to change the bpf_filter variable.

    NB: if you want to save the raw packet data instead of the converted-to-string version, modify the relevant line (which is commented in the code).

    from scapy.all import *
    import urllib
    import urlparse
    import socket     # only needed for the commented-out gethostbyname line
    import threading
    import time
    import re
    
    proxy   = "http://my.proxy.server:8080"
    proxyIP = "1.2.3.4" # IP address of proxy
    
    # list of URLs
    urls = ["http://www.bbc.co.uk/news",
            "http://www.google.co.uk"]
    
    packets = []
    
    # packet callback
    def pkt_callback(pkt):
        packets.append(pkt) # save the packet
    
    # monitor function
    def monitor(fname):
        del packets[:]
        bpf_filter = "tcp and host " + proxyIP       # set this filter to capture the traffic you want
        sniff(timeout=60, prn=pkt_callback, filter=bpf_filter, store=0)
        with open(fname + ".data", 'w') as f:
            for pkt in packets:
                f.write(repr(pkt))  # or just save the raw packet data instead
                f.write('\n')
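        # alternatively (assumption: you want raw packets rather than text),
        # scapy's wrpcap() writes the capture straight to a pcap file:
        #   wrpcap(fname + ".pcap", packets)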
    
    for url in urls:
        print "capturing: " + url
        mon = threading.Thread(target=monitor, args=(re.sub(r'\W+', '', url),))
        mon.start()
        time.sleep(1)   # crude: give the sniffer thread a moment to start capturing
        data = urllib.urlopen(url, proxies={'http': proxy})
        # this line gets the IP address of the url host, might be helpful
        # addr = socket.gethostbyname(urlparse.urlparse(data.geturl()).hostname)
        mon.join()
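
    If you'd rather stop capturing as soon as the transfer looks finished instead of always waiting out the 60 second timeout, sniff() also takes a stop_filter callback. A minimal sketch, assuming a TCP FIN from the proxy marks the end of the page load (an assumption that breaks down on keep-alive connections):

    from scapy.all import *

    proxyIP = "1.2.3.4" # IP address of proxy, as above

    def fin_seen(pkt):
        # stop once a TCP FIN appears; timeout is kept as a backstop
        return TCP in pkt and bool(pkt[TCP].flags & 0x01)

    packets = sniff(timeout=60, filter="tcp and host " + proxyIP,
                    stop_filter=fin_seen)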
    

    Hope this gives you a good starting point.