I am writing a tool in Python (on Linux). One of its tasks is to capture a live TCP stream and apply a function to each line. Currently I'm using:
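import subprocess

# interface and ip are set elsewhere; sudo needs a matching entry in /etc/sudoers
proc = subprocess.Popen(['sudo', 'tcpflow', '-C', '-i', interface, '-p', 'src', 'host', ip], stdout=subprocess.PIPE)
# readline() returns bytes on Python 3, so the end-of-stream sentinel must be b''
for line in iter(proc.stdout.readline, b''):
    do_something(line)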
This works quite well (with the appropriate entry in /etc/sudoers), but I would like to avoid calling an external program.
So far I have looked into the following possibilities:
flowgrep: a Python tool that looks like just what I need, BUT it uses pynids internally, which is 7 years old and seems pretty much abandoned. There is no pynids package for my Gentoo system, and pynids ships with a patched version of libnids which I couldn't compile without further tweaking.
scapy: this is a packet manipulation program/library for Python. I'm not sure whether TCP stream reassembly is supported.
pypcap or pylibpcap: wrappers for libpcap. Again, libpcap only captures packets, whereas I need stream reassembly, which is not possible with it according to this question.
Before I dive deeper into any of these libraries, I would like to know whether someone has a working code snippet (this seems like a rather common problem). I would also be grateful for advice about the right way to go.
Thanks
Jon Oberheide has led efforts to maintain pynids, and a fairly up-to-date version is available at http://jon.oberheide.org/pynids/
So this might let you explore flowgrep further. Pynids itself handles stream reconstruction rather elegantly. See http://monkey.org/~jose/presentations/pysniff04.d/ for some good examples.
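One thing to keep in mind if you move away from tcpflow: a reassembler typically hands you the stream in arbitrary chunks, not in lines, so you still need a small buffer in front of do_something. Below is a minimal sketch of that step only, assuming Python 3 and newline-terminated data. The names LineSplitter, feed and close are made up for illustration and are not part of pynids or any other library; you would call feed() from whatever data callback your reassembler provides.

```python
class LineSplitter(object):
    """Turn arbitrary byte chunks from one stream direction into whole lines.

    Hypothetical helper for illustration; not part of pynids or tcpflow.
    """

    def __init__(self, on_line):
        # on_line is the per-line function, e.g. do_something from the question
        self._on_line = on_line
        self._pending = b""

    def feed(self, chunk):
        """Add newly reassembled bytes and emit every complete line."""
        self._pending += chunk
        # Everything before the last newline is complete; the rest stays buffered.
        *lines, self._pending = self._pending.split(b"\n")
        for line in lines:
            self._on_line(line)

    def close(self):
        """Flush a trailing line that was not newline-terminated."""
        if self._pending:
            self._on_line(self._pending)
            self._pending = b""
```

Lines are passed on without the trailing newline. You would create one LineSplitter per connection and direction, call feed() whenever new data arrives, and call close() when the connection ends.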