python multithreading network-programming scapy sniffing

Sending packets and sniffing in the same Python script

I need to set up a connection to each website in a list, send some packets, and sniff only the packets for that website until I move on to the next website (the next iteration). When I move to the next iteration, I want to sniff and filter for that website's address only. Can I achieve this within a single Python script?

sniff(filter="ip and host " + ip_addr,prn=print_summary)
req = "GET / HTTP/1.1\r\nHost: "+ website +"\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/58.0.3029.110 Chrome/58.0.3029.110 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\nAccept-Language: en-US,en;q=0.8\r\n\r\n"
url = (website, 80)
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto=socket.IPPROTO_TCP)
c.settimeout(5.0)
c.connect(url)
c.setsockopt(socket.SOL_IP, socket.IP_TTL, i)
c.send(req)
print str(c.recv(4096))
c.close()

I am running the above code in a loop, but on its first run it gets stuck in the sniff function. Can anyone help me with this?

Solution


Sniffing packets for a single website isn't easy, as the Berkeley Packet Filter (BPF) syntax used by scapy doesn't have a simple option for HTTP. See this question for some suggestions on the options available.
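If you do want to filter on the site's address rather than on a proxy, the closest you can get with BPF is a host-and-port filter. Here is a minimal sketch of that idea, written for Python 3 (the sample further down is Python 2). The function name bpf_filter_for and the default port 80 are my own choices for illustration. It only matches the single IPv4 address the name resolves to at that moment, so sites served from several addresses or a CDN may slip past it.

```python
import socket
from urllib.parse import urlparse


def bpf_filter_for(url, port=80):
    """Return a BPF filter string for TCP traffic to/from the URL's host.

    Hypothetical helper: resolves the hostname once and filters on that
    single IPv4 address only.
    """
    hostname = urlparse(url).hostname
    if hostname is None:
        raise ValueError("no hostname in URL: %r" % url)
    # gethostbyname() returns an IPv4 address string; an IP literal
    # is passed through unchanged without a DNS lookup.
    ip_addr = socket.gethostbyname(hostname)
    return "tcp and host %s and port %d" % (ip_addr, port)
```

The returned string can be passed as the filter argument of sniff(), in place of the hard-coded proxy filter.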

One possibility is to sniff the TCP packets to and from your web proxy server. The code sample below does this and saves the TCP packets captured for each URL in a list to its own named file. Because sniff() blocks until it finishes (which is why your loop gets stuck on the first run), the sample runs it in a separate thread while the main thread makes the request. I haven't put in any logic to detect when the page load finishes; I just used a 60-second timeout. If you want something different, you can use this as a starting point. If you don't have a proxy server to sniff, you'll need to change the bpf_filter variable.
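If waiting out a fixed timeout for every site is too slow, one alternative is to stop the capture as soon as the request has finished. The sketch below assumes a scapy version that provides AsyncSniffer (a background sniffer with start() and stop()) and Python 3. The helper name capture_during, the sniffer_factory parameter and the settle delay are hypothetical, and the sleeps are a crude guess at how long the capture needs to open and drain, not a guarantee.

```python
import time


def capture_during(action, bpf_filter, sniffer_factory=None, settle=0.5):
    """Run action() while a background sniffer captures matching packets.

    Hypothetical helper: sniffer_factory defaults to scapy's AsyncSniffer
    and is a parameter only so the logic can be tried without scapy.
    """
    if sniffer_factory is None:
        # Imported lazily so this module loads even without scapy installed.
        from scapy.all import AsyncSniffer
        sniffer_factory = AsyncSniffer

    captured = []
    # list.append returns None, so scapy prints nothing per packet.
    sniffer = sniffer_factory(filter=bpf_filter, prn=captured.append, store=False)
    sniffer.start()
    time.sleep(settle)      # crude: give the capture time to start
    try:
        action()            # e.g. connect, send the request, read the reply
    finally:
        time.sleep(settle)  # crude: let trailing packets arrive
        sniffer.stop()      # end the capture instead of waiting for a timeout
    return captured
```

In the loop you would call it once per site with that site's filter, passing the code that sends the request as action.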

NB: if you want to save the raw packet data instead of the converted-to-string version, modify the relevant line (which is commented in the code).
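To make the two options concrete, here is a rough Python 3 sketch of a writer that can emit either form. The name save_packets and the one-hex-line-per-packet layout are my own assumptions, not anything scapy prescribes. It relies on bytes(pkt) giving the packet's raw bytes, which works for scapy packets under Python 3. If what you want is a capture file you can open in Wireshark, scapy's wrpcap(filename, packets) is probably the better choice.

```python
import binascii


def save_packets(path, packets, raw=False):
    """Write one line per packet to path.

    raw=False writes repr(pkt), as the sample does.
    raw=True writes the packet's bytes as a hex string instead.
    """
    with open(path, "w") as f:
        for pkt in packets:
            if raw:
                # bytes(pkt) yields the wire bytes for a scapy packet
                line = binascii.hexlify(bytes(pkt)).decode("ascii")
            else:
                line = repr(pkt)
            f.write(line + "\n")
```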

from scapy.all import *
import urllib
import urlparse
import threading
import re

proxy   = "http://my.proxy.server:8080"
proxyIP = "1.2.3.4" # IP address of proxy

# list of URLs
urls = ["http://www.bbc.co.uk/news",
        "http://www.google.co.uk"]

packets = []

# packet callback
def pkt_callback(pkt):
    packets.append(pkt) # save the packet

# monitor function
def monitor(fname):
    del packets[:]
    bpf_filter = "tcp and host " + proxyIP       # set this filter to capture the traffic you want
    sniff(timeout=60, prn=pkt_callback, filter=bpf_filter, store=0)
    with open(fname + ".data", 'w') as f:   # file is closed even if a write fails
        for pkt in packets:
            f.write(repr(pkt))  # or just save the raw packet data instead
            f.write('\n')

for url in urls:
    print "capturing: " + url
    mon = threading.Thread(target=monitor, args=(re.sub(r'\W+', '', url),))
    mon.start()
    data = urllib.urlopen(url, proxies={'http': proxy})
    # this line gets IP address of url host, might be helpful 
    # addr = socket.gethostbyname(urlparse.urlparse(data.geturl()).hostname)
    mon.join()
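
The sample above is Python 2. On Python 3, urllib.urlopen and its proxies argument are gone, so the request line needs a different form. A minimal sketch of one way to do it with urllib.request follows; the function names make_proxy_opener and fetch are hypothetical, and the 10-second timeout is an arbitrary choice.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Build an opener that sends http:// requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(opener, url, timeout=10):
    """Fetch url and read the whole body, so the full response
    crosses the wire while the sniffer is running."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

You would build the opener once with your proxy address and call fetch(opener, url) inside the loop, between mon.start() and mon.join().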

Hope this gives you a good starting point.