
Find how much data has been transferred from pcap data


I have a pcap file that contains a traffic trace from an experiment, in binary format. I'm trying to find out how much data different hosts exchange with each other, but I'm pretty new to working with pcap, and I've been searching and trying different things without success.

Can tcpdump be useful here? I've processed the raw file with it and got something like this:

2009-12-17 17:26:04.398500 IP 41.177.117.184.1618 > 41.177.3.224.51332: Flags [P.], seq 354231048:354231386, ack 3814681859, win 65535, length 338
2009-12-17 17:26:04.398601 IP 90.218.72.95.10749 > 244.3.160.239.80: Flags [P.], seq 1479609190:1479610159, ack 3766710729, win 17520, length 969
2009-12-17 17:26:04.398810 IP 244.3.160.239.80 > 90.218.72.95.10749: Flags [.], ack 969, win 24820, length 0
2009-12-17 17:26:04.398879 IP 41.177.3.224.51332 > 41.177.117.184.1618: Flags [P.], seq 1:611, ack 338, win 65535, length 610

Are the "length" values at the end of each line good indicators of how much data two hosts have transferred to each other?

The problem is that when I look at the raw file in Wireshark, this length appears to be the TCP header length, while the data/payload size is listed separately (38 bytes for the first of these four packets), which is confusing me.

To sum up, Wireshark says the following about the first packet: 1) "396 bytes on wire", 2) "96 bytes captured", 3) "len: 338", 4) "Data (38 bytes)".

Tcpdump says: "length 338"

How do I find payload size? I'm willing to use Python if possible as I'll be working with a huge capture file.


Solution

  • Can tcpdump be useful here?

    Yes.

    Are the "length" values at the end of each line good indicators of how much data two hosts have transferred to each other?

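    Yes. That is the number of payload bytes transferred, excluding headers. The Wireshark figures for the first packet are consistent with this: 396 bytes on the wire minus 338 bytes of TCP payload leaves 58 bytes of headers, and because only 96 bytes of the packet were captured, just 96 - 58 = 38 payload bytes are present in the file. "Data (38 bytes)" is therefore the truncated part of the payload that was captured, not the full payload.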

    How do I find payload size? I'm willing to use Python if possible as I'll be working with a huge capture file.

    You didn't specify a protocol, so let's assume that by "payload size" you mean "everything after the IP header". This is easy to do with Python and dpkt. Following Jon's tutorial, and assuming IP packets with no options, code that probably does what you want looks like this:

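    #!/usr/bin/env python

    import dpkt
    from socket import inet_ntoa

    # pcap is a binary format, so open the file in binary mode
    with open("sample.pcap", "rb") as f:
        pcap = dpkt.pcap.Reader(f)
        for ts, buf in pcap:
            ip = dpkt.ethernet.Ethernet(buf).data
            if not isinstance(ip, dpkt.ip.IP):
                continue  # skip ARP, IPv6 and other non-IPv4 frames
            # ip.len is the IP total length (header + payload) read from the
            # header, so it is correct even if the capture truncated the packet;
            # ip.hl is the IP header length in 32-bit words
            payload_len = ip.len - ip.hl * 4
            print("{} --> {} {}".format(inet_ntoa(ip.src), inet_ntoa(ip.dst), payload_len))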
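
If the tcpdump text output from the question is already at hand, the per-packet "length" values can also be summed per direction to get totals between hosts. The following is only a minimal sketch using the standard library: the function name `sum_payload_by_pair` and the regular expression are my own, and the pattern assumes IPv4 lines laid out like the four sample lines above (other protocols or tcpdump options may print lines it does not match, which are skipped).

```python
import re
from collections import defaultdict

# Assumed line layout, taken from the sample output in the question:
#   <timestamp> IP <src_ip>.<sport> > <dst_ip>.<dport>: ... length <n>
LINE_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.\d+ > (\d+\.\d+\.\d+\.\d+)\.\d+: .*\blength (\d+)\s*$"
)


def sum_payload_by_pair(lines):
    """Return {(src_ip, dst_ip): total of the 'length' values} for matching lines."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        src, dst, length = match.groups()
        totals[(src, dst)] += int(length)
    return dict(totals)


# The four lines quoted in the question
sample = [
    "2009-12-17 17:26:04.398500 IP 41.177.117.184.1618 > 41.177.3.224.51332: Flags [P.], seq 354231048:354231386, ack 3814681859, win 65535, length 338",
    "2009-12-17 17:26:04.398601 IP 90.218.72.95.10749 > 244.3.160.239.80: Flags [P.], seq 1479609190:1479610159, ack 3766710729, win 17520, length 969",
    "2009-12-17 17:26:04.398810 IP 244.3.160.239.80 > 90.218.72.95.10749: Flags [.], ack 969, win 24820, length 0",
    "2009-12-17 17:26:04.398879 IP 41.177.3.224.51332 > 41.177.117.184.1618: Flags [P.], seq 1:611, ack 338, win 65535, length 610",
]

totals = sum_payload_by_pair(sample)
for (src, dst), total in sorted(totals.items()):
    print("{} -> {}: {} bytes".format(src, dst, total))
```

For a large capture, the same function could be fed line by line from `sys.stdin`, for example by piping the output of `tcpdump -nn -r sample.pcap` into the script, so that the whole file never has to be held in memory. Each direction is counted separately; add the two directions together if you want one total per host pair.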