
How to achieve tcpflow functionality (follow TCP stream) purely within Python


I am writing a tool in Python (on Linux). One of its tasks is to capture a live TCP stream and apply a function to each line. Currently I'm using:

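import subprocess

# interface and ip (the source host to watch) are set elsewhere in the tool
proc = subprocess.Popen(['sudo', 'tcpflow', '-C', '-i', interface, '-p', 'src', 'host', ip],
                        stdout=subprocess.PIPE)

# readline() returns bytes, so the end-of-stream sentinel must be b'' (on Python 3, '' never matches)
for line in iter(proc.stdout.readline, b''):
    do_something(line)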

This works quite well (with the appropriate entry in /etc/sudoers), but I would like to avoid calling an external program.

So far I have looked into the following possibilities:

  • flowgrep: a Python tool that looks like exactly what I need, but it uses pynids internally, which is 7 years old and seems pretty much abandoned. There is no pynids package for my Gentoo system, and pynids ships with a patched version of libnids that I couldn't compile without further tweaking.

  • scapy: a packet manipulation program/library for Python. I'm not sure whether it supports TCP stream reassembly.

  • pypcap or pylibpcap, as wrappers for libpcap. Again, libpcap is for packet capture, whereas I need stream reassembly, which is not possible with libpcap alone according to this question.

Before I dive deeper into any of these libraries, I would like to know whether someone has a working code snippet (this seems like a rather common problem). I would also be grateful for advice about the right way to go.

Thanks


Solution

  • Jon Oberheide has led efforts to maintain pynids; a fairly up-to-date version is available at http://jon.oberheide.org/pynids/

    So, this might let you explore flowgrep further. pynids itself handles stream reconstruction rather elegantly. See http://monkey.org/~jose/presentations/pysniff04.d/ for some good examples.
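To make it clearer what "stream reconstruction" involves, here is a minimal pure-Python sketch of the idea. It is not pynids or flowgrep code. The class name `StreamReassembler` and its methods are made up for illustration. It assumes you already get `(sequence number, payload)` pairs for one direction of one connection from whatever capture library you pick, and it ignores sequence-number wraparound, connection setup and teardown, and memory limits, all of which a real reassembler such as libnids has to handle.

```python
class StreamReassembler:
    """Hypothetical helper: puts the payloads of ONE direction of ONE TCP
    connection back in order and calls on_line() for every complete line."""

    def __init__(self, initial_seq, on_line):
        self.next_seq = initial_seq  # sequence number of the next byte we expect
        self.pending = {}            # segments received so far: seq -> payload
        self.buffer = b""            # in-order bytes that do not yet end a line
        self.on_line = on_line       # callback applied to each complete line

    def add_segment(self, seq, payload):
        """Feed one captured segment; segments may arrive in any order."""
        if not payload:
            return
        # Keep the longest payload seen for a given starting sequence number.
        if len(payload) > len(self.pending.get(seq, b"")):
            self.pending[seq] = payload
        self._drain()

    def _drain(self):
        # Deliver pending segments for as long as there is no gap.
        while self.pending:
            seq = min(self.pending)
            if seq > self.next_seq:
                return  # gap: wait for the missing segment
            payload = self.pending.pop(seq)
            # Drop bytes that were already delivered (retransmission or overlap).
            fresh = payload[self.next_seq - seq:]
            self.next_seq += len(fresh)
            self._deliver(fresh)

    def _deliver(self, data):
        # Split the in-order byte stream into lines; keep the unfinished tail.
        self.buffer += data
        *lines, self.buffer = self.buffer.split(b"\n")
        for line in lines:
            self.on_line(line)
```

You would create one instance per connection and direction, e.g. `StreamReassembler(first_data_seq, do_something)`, and call `add_segment(seq, payload)` for every captured packet. Lines are passed to the callback as bytes without the trailing newline.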