wireshark, tcpdump, tshark

Recursively Filter a Directory of .cap/.pcap Files With tshark


I am trying to apply a Wireshark filter to a directory of .cap files created by tcpdump. I have about 1,000 .cap files awaiting filtering (we are intentionally capturing without filters to ensure we have all the data we need in case our hypothesis doesn't pan out). I cannot find any references to people using tshark to work through a directory file by file: read a file, apply the filter, write out a new .cap file, then move on to the next.

My setup:

Tcpdump is dumping traffic and rolling to a new file once the file reaches 1GB (yes, huge for pcaps). Just for reference, this is the tcpdump command I'm using:

sudo tcpdump -q -i <INTERFACE> -w path/to/capfile.cap -C 1000 -Z root

I can use tshark to apply a filter to a given .cap file and have it output to a new .cap file no problem using the following command:

tshark -R <FILTER> -r in.cap0001 -w out.cap0001

The tshark man page states,

"-r ...It is possible to use named pipes or stdin (-) here..."

but I am by no means an expert with named pipes or stdin, nor am I a programmer.

Could someone point me in the right direction? Thanks!


Solution

  • I think you can achieve this directly with some shell commands.

    Go to the directory that holds your captures and run the following command:

    ls | grep '\.cap$' | while read f; do (tshark -R <FILTER> -r "$f" -w "mod_$f"); done

    This produces new .cap files with the desired filter applied. You can tweak the command to fit your needs, but it is a good starting point.


    Commands explanation:

    ls: Lists all files contained in the current directory.

    |: Pipe. The standard output of the left-hand command becomes the standard input of the right-hand command.

    grep '\.cap$': Ensures that only file names ending with .cap are matched. As per your comment, note that if the file names do not end exactly with .cap, this filter should be changed to grep '\.cap', since $ tells grep that the line must end with whatever precedes it (in this example, the ".cap" string).

    while read f: Reads each line produced by the previous commands into the variable f.

    do (<COMMAND>): For each line read, runs COMMAND (in a subshell, because of the parentheses), which in this case is your tshark command.

    done: Closes the while loop.
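
To make the difference between the two grep patterns concrete, here is a small sketch that runs both against a made-up list of names. The names are hypothetical (in.cap0001 mirrors the naming used in the question), and list_names is a helper invented for this illustration, not part of the original command.

```shell
# Hypothetical file names; in.cap0001 follows the naming used in the question.
list_names() {
    printf '%s\n' in.cap in.cap0001 in.cap0002 non_cap_file.txt
}

# Anchored pattern: the line must end with ".cap".
anchored=$(list_names | grep '\.cap$')

# Unanchored pattern: ".cap" may appear anywhere in the line.
unanchored=$(list_names | grep '\.cap')

printf 'anchored:\n%s\n' "$anchored"
printf 'unanchored:\n%s\n' "$unanchored"
```

With these names, only in.cap passes the anchored pattern, while the unanchored pattern also picks up the numbered files. The unanchored form is looser, so it would also match an unrelated name that merely contains ".cap" somewhere in the middle.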


    Example command execution:

    # ls
    fake.cap2  non_cap_file.txt  out2.cap  out.cap
    
    # ls | grep '\.cap$'
    out2.cap
    out.cap
    

    The following is the only line that you actually need to execute; the others are here just to illustrate the folder contents before and after the command runs.

    # ls | grep '\.cap$' | while read f; do (tshark -R <FILTER> -r "$f" -w "mod_$f"); done
    Running as user "root" and group "root". This could be dangerous.
    Running as user "root" and group "root". This could be dangerous.
    

    As a side note, I got the root warning because I ran this quick test as the root user.

    # ls
    fake.cap2  mod_out2.cap  mod_out.cap  non_cap_file.txt  out2.cap  out.cap
    
    # ls | grep '\.cap$'
    mod_out2.cap
    mod_out.cap
    out2.cap
    out.cap
    

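    As you can see, it grabs every .cap file, applies the specified filter, and writes a new mod_*.cap file. Note that the mod_*.cap files also end in .cap, so running the command a second time in the same directory would pick them up as input too.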
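
A possible variation on the loop above, offered as a sketch rather than a drop-in replacement: it uses a shell glob instead of ls | grep, quotes the file names, and writes the filtered copies to a separate directory so they are not picked up as input on a later run. The function name filter_caps, the TSHARK variable, and the example paths and filter are all assumptions made for this illustration. The -R flag is kept from the answer; newer tshark releases generally expect -Y (or -2 -R) for this kind of filtering, so check the man page for your version.

```shell
# TSHARK lets you point the sketch at a different binary (assumption, not a tshark feature).
TSHARK=${TSHARK:-tshark}

# filter_caps IN_DIR OUT_DIR FILTER  (hypothetical helper)
filter_caps() {
    in_dir=$1     # directory holding the unfiltered captures
    out_dir=$2    # directory that receives the filtered copies
    filter=$3     # Wireshark filter expression

    mkdir -p "$out_dir" || return 1

    # *.cap* also matches numbered names such as in.cap0001
    for f in "$in_dir"/*.cap*; do
        [ -f "$f" ] || continue    # skip if the glob matched nothing
        name=${f##*/}              # strip the directory part
        # -R as in the answer; swap for -Y on newer tshark versions
        "$TSHARK" -R "$filter" -r "$f" -w "$out_dir/mod_$name" ||
            echo "tshark failed on $f" >&2
    done
}

# Example call (placeholder paths and filter):
# filter_caps /path/to/captures /path/to/filtered 'tcp.port == 80'
```

Keeping the output in its own directory is a design choice: the originals stay untouched, and rerunning the function only overwrites earlier filtered copies. The glob covers a single directory; captures spread across subdirectories would need something like find instead.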