linux unix pcap tcpdump zcat

Looking for a specific message in gzipped pcap files


I have around 7200 compressed .pcap files. Each is compressed into a separate .gz file. I need to look for a specific string in packet data details. I would like to write a command to do that. At the moment all I have is:

zcat 20230212*.pcap.gz | tcpdump -qns 0 -X | grep "specific string"

where 20230212*.pcap.gz is the pattern that matches these files.

I know that the problem is somewhere in the tcpdump part. Sorry for my English.

Update

I tried

tcpdump -qns 0 -A -r filename.pcap | grep "string"

where filename is the name of a specific file that contains the string. It works, but I had to unzip the file first, and I cannot do that for all the files. I also tried:

tcpdump -qns 0 -X -r filename.pcap | grep "string"

but this command cannot find the string. Running

xargs zcat filename.pcap.gz | tcpdump -qns 0 -A -r | grep "string"

gives me: tcpdump: option requires an argument -- 'r'


Solution

  • tcpdump: option requires an argument -- 'r'

    The -r flag needs to be given an argument to indicate what to read.

    An argument of - means "read the standard input", which is what you want here, as you're piping the result of zcat to it.

    So you want

    zcat filename.pcap.gz | tcpdump -qns 0 -A -r - | grep "string"
    

    You don't want xargs, because, with

    xargs zcat filename.pcap.gz | tcpdump -qns 0 -A -r - | grep "string"
    

    it will:

    • read file names from the standard input of the first command - meaning that, if you run that exact command from the command line, it will read file names from the terminal, so you would have to type a bunch of file names, followed by control-D to mark the end of the list of file names;
    • collect the file names into bunches;
    • run zcat filename.pcap.gz {bunch of file names} - meaning that it will decompress filename.pcap.gz first, followed by all of the files in that bunch, and write the decompressed contents of all those files as a single stream of raw bytes;
    • read more file names and do that again until it runs out of file names;

    which means that what tcpdump will see will look like a bunch of pcap-format files stuck together ("concatenated") into one. That will NOT look like a single pcap-format file to tcpdump; instead, it will look like the first pcap file, followed by a lot of stuff that will not look like valid pcap file contents, so tcpdump will probably print an error and give up.

    (And other programs that read pcap-format files, such as tshark, will do the exact same thing. There's no magic flag or tool to fix that.)
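The concatenation problem can be seen without any real captures. The following is a minimal sketch, not part of the original answer: it builds two empty little-endian pcap files by hand (each is just the 24-byte pcap file header; the temporary directory and file names are made up for the demonstration), compresses them, and runs zcat on both at once.

```shell
#!/bin/sh
# Write the 24-byte header of a little-endian pcap file with no packets.
pcap_header() {
    printf '\324\303\262\241'   # magic number 0xa1b2c3d4, little-endian
    printf '\002\000\004\000'   # format version 2.4
    printf '\000\000\000\000'   # time zone offset (unused)
    printf '\000\000\000\000'   # timestamp accuracy (unused)
    printf '\377\377\000\000'   # snapshot length 65535
    printf '\001\000\000\000'   # link type 1 (Ethernet)
}

dir=$(mktemp -d)                # hypothetical scratch directory
pcap_header > "$dir/a.pcap"
pcap_header > "$dir/b.pcap"
gzip "$dir/a.pcap" "$dir/b.pcap"

# zcat with two arguments writes both decompressed files back to back.
zcat "$dir/a.pcap.gz" "$dir/b.pcap.gz" > "$dir/joined"

# 24 + 24 bytes: two file headers in one stream.
joined_bytes=$(wc -c < "$dir/joined" | tr -d ' ')
echo "joined stream: $joined_bytes bytes"

# The second magic number sits at offset 24, where a reader of a single
# pcap file expects a packet record, not another file header.
od -A d -t x1 "$dir/joined"
```

The od output shows d4 c3 b2 a1 twice, once at offset 0 and once at offset 24. With real captures the second header would simply appear after the last packet of the first file.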

    What you should do, instead, is have a small shell script, such as

    #! /bin/sh
    echo "Processing $1:"
    zcat "$1" | tcpdump -qns 0 -A -r - | grep "$2"
    

    and, to look for a given string in one .pcap.gz file, do

    {path to script} {file name} "string"
    

    where {path to script} is the path name of the script and {file name} is the pathname of the file.

    To scan all the files, do

    for file in 20230212*.pcap.gz
    do
        {path to script} "$file" "string"
    done >/tmp/output
    

    That loop iterates over all files that match 20230212*.pcap.gz and, for each of them, runs the script on the file, looking for the string. The output of the entire loop is sent to the file /tmp/output.

    Note that /tmp/output will contain one line for every file, giving the name of the file. If you don't care which capture files contain the string, you can remove the

    echo "Processing $1:"
    

    line from the script. If you do care which capture files contain the string, but you don't care what the exact text that matches is, you can have the script be

    #! /bin/sh
    echo "Processing $1:"
    if zcat "$1" | tcpdump -qns 0 -A -r - | grep -q "$2"
    then
        echo "$1 contains \"$2\""
    fi
    

    which tests whether the grep command found the string and, if it did, prints a message. The -q flag causes grep not to write the matching text out, so the file doesn't have that extra information in it.
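If a separate script file is inconvenient, the loop and the test can be folded into one shell function. This is only a sketch under a few assumptions: the function name list_matching_captures is made up, grep is given -F so the string is matched literally rather than as a regular expression, and tcpdump's standard error is discarded to hide the "reading from file -" notice it prints for every file (which also hides genuine error messages).

```shell
#!/bin/sh
# list_matching_captures STRING FILE...
# Print the name of each .pcap.gz file whose decoded packets contain STRING.
# (Hypothetical helper name, not from the original answer.)
list_matching_captures() {
    pattern=$1
    shift
    for file in "$@"
    do
        # If the glob matched nothing, the shell passes the pattern itself;
        # skip anything that is not an existing file.
        [ -e "$file" ] || continue
        # One zcat | tcpdump pipeline per file, so tcpdump always sees
        # exactly one pcap stream on its standard input.
        if zcat "$file" | tcpdump -qns 0 -A -r - 2>/dev/null | grep -qF -- "$pattern"
        then
            printf '%s\n' "$file"
        fi
    done
}

# Example call, commented out so the file can also be sourced:
# list_matching_captures "string" 20230212*.pcap.gz >/tmp/output
```

With this variant /tmp/output lists only the capture files that contain the string, one name per line.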

    After using: xargs zcat "filename" | tcpdump -qns 0 -X | grep "string", I receive: tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on bond0, link-type EN10MB (Ethernet), capture size 262144 bytes

    That's because you didn't provide a -r argument to tcpdump, which means that it will capture live traffic from a network interface. Because you also didn't specify a -i argument, which would name the interface to capture from, it picks the first interface that shows up in the list it gets from the system, which happened to be bond0 on your system.

    You need to specify -r to get tcpdump to read from a capture file.

    but this command cannot find string.

    That command uses -X, not -A, so it dumped out packet data in a format like this:

        0x0020:  5010 1920 a97a 0000 4854 5450 2f31 2e31  P....z..HTTP/1.1
        0x0030:  2032 3030 204f 4b0d 0a44 6174 653a 2046  .200.OK..Date:.F
        0x0040:  7269 2c20 3236 2041 7567 2032 3030 3520  ri,.26.Aug.2005.
    

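    There's no guarantee that the string will all fit on one line: each line of -X output shows only 16 bytes of packet data, and in the sample above the spaces in the data are printed as dots, so grep can miss a string that is present in the packet. Use -A, as in the commands above, to search packet text.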
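The effect can be reproduced with nothing but the three sample lines shown above. In this small sketch the variable xdump and the helper count_lines are names invented for the illustration; grep -c prints how many lines contain the given fixed string.

```shell
#!/bin/sh
# The three lines of -X output from the sample above.
xdump='        0x0020:  5010 1920 a97a 0000 4854 5450 2f31 2e31  P....z..HTTP/1.1
        0x0030:  2032 3030 204f 4b0d 0a44 6174 653a 2046  .200.OK..Date:.F
        0x0040:  7269 2c20 3236 2041 7567 2032 3030 3520  ri,.26.Aug.2005.'

# Print the number of dump lines that contain the fixed string $1.
# "|| true" keeps the exit status at 0 when grep finds nothing.
count_lines() {
    printf '%s\n' "$xdump" | grep -cF -- "$1" || true
}

count_lines 'HTTP/1.1'          # prints 1: fits inside one 16-byte row
count_lines 'HTTP/1.1 200 OK'   # prints 0: crosses a row, spaces shown as dots
count_lines 'Date: Fri'         # prints 0: split between the last two rows
```

The packet does contain "HTTP/1.1 200 OK", yet a line-oriented search of the -X layout cannot see it.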