c++ pcap pcapplusplus

Can you index a PCAP file without loading it all into memory?


I have to work with PCAPs that are quite large, around 40 GB. Right now I use PCAP++ to parse the PCAPs one at a time and process the data inside them. That data is placed into a buffer so it can be viewed. To save memory, I throw out old data as the user moves through the PCAP, which keeps memory use at about 150 MB at a time. However, if the user wants to go back and view data from too far back, they can't, because it has already been thrown out.

Is there any way to jump straight to the packets in the PCAP file that held a given piece of data and reprocess them when the user wants to look back? As far as I can tell, to get specific packets I would either have to reload the file and scan through all of it again for every section of data, or split the pcap file into a lot of bite-size chunks.


Solution

  • I figured it out, and the short answer is no: as far as I could tell, PCAP++ doesn't offer anything that can index, or mimic indexing on, a pcap file. I switched back to libpcap so I could work with the underlying file stream directly (this should also work on Windows with WinPcap, but I haven't tested that yet). To do this properly, you save the file position of each critical packet (or of every packet, depending on what you need) in the pcap file. Here's how it works:

    #ifdef _MSC_VER
        pcap_t *pcap = pcap_open_offline((pcapPath.string()).c_str(), errbuf);
    #else
        pcap_t *pcap = pcap_open_offline(pcapPath.c_str(), errbuf);
    #endif
    
    //-----------------------------
    //... General Packet setup ...
    //-----------------------------
    
    // Store positions by value: passing an uninitialized fpos_t* to
    // fgetpos is undefined behavior.
    vector<fpos_t> pcapIndexer;
    while(/*Get Next Packet*/){
    
        //Parse the packet and get required flag
        //to know if it is a critical packet
    
        if(/*Check some condition from the data*/){
            // Note: the packet has already been read at this point, so the
            // stream most likely sits at the start of the NEXT record. To
            // index this packet itself, capture the position before fetching it.
            fpos_t position;
    #ifdef _MSC_VER
            pcap_fgetpos(pcap, &position);
    #else
            FILE* f = pcap_file(pcap);
            fgetpos(f, &position);
    #endif
            pcapIndexer.push_back(position);
        }
    }

    The code above goes through each packet and, depending on the data in the packet, adds the current file position to the vector of positions. Then, once you need to load a packet, you create a new pcap_t pointer and use this:

    #ifdef _MSC_VER
        pcap_fsetpos(pcap, &pcapIndexer[x]);
    #else
        FILE* f = pcap_file(pcap);
        fsetpos(f, &pcapIndexer[x]);
    #endif
    

    You can then read as normal from that point. Note that you have to run through the entire pcap file once in order to fill this vector of file positions. However, if you use multithreading, you can already work with the data from the earlier packets while the rest of the pcap file is still being read and added to the vector. This lets you load a select amount of data and then jump back and forth to grab data from elsewhere in the pcap file. I hope this helps someone else in this situation.
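
    To make the indexing idea concrete without depending on any capture library, here is a minimal sketch that swaps in a different technique: it parses the classic pcap file layout by hand with std::ifstream instead of going through libpcap. The names buildPcapIndex, readPacketAt and readU32LE are hypothetical helpers made up for this illustration. It assumes a little-endian, microsecond-resolution file (magic 0xa1b2c3d4) and does not handle pcapng, byte-swapped files, or corrupt length fields.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Classic pcap layout: a 24-byte global header, then for each packet a
// 16-byte record header (ts_sec, ts_usec, incl_len, orig_len) followed by
// incl_len bytes of captured data.
constexpr std::streamoff kGlobalHeaderSize = 24;

// Hypothetical helper: reads one 32-bit little-endian value.
// Returns false if fewer than 4 bytes could be read.
static bool readU32LE(std::istream& in, std::uint32_t& out) {
    unsigned char b[4];
    if (!in.read(reinterpret_cast<char*>(b), 4)) return false;
    out = static_cast<std::uint32_t>(b[0]) |
          (static_cast<std::uint32_t>(b[1]) << 8) |
          (static_cast<std::uint32_t>(b[2]) << 16) |
          (static_cast<std::uint32_t>(b[3]) << 24);
    return true;
}

// Hypothetical helper: one pass over the file, recording the byte offset at
// which every record header starts. Packet payloads are skipped, not loaded.
std::vector<std::streamoff> buildPcapIndex(const std::string& path) {
    std::vector<std::streamoff> offsets;
    std::ifstream in(path, std::ios::binary);
    if (!in) return offsets;

    std::uint32_t magic = 0;
    // Assumption: only the little-endian microsecond magic is accepted.
    if (!readU32LE(in, magic) || magic != 0xa1b2c3d4u) return offsets;
    in.seekg(kGlobalHeaderSize, std::ios::beg);

    for (;;) {
        // Capture the position BEFORE reading, so it points at this record.
        const std::streamoff recordStart = in.tellg();
        std::uint32_t tsSec = 0, tsUsec = 0, inclLen = 0, origLen = 0;
        if (!readU32LE(in, tsSec) || !readU32LE(in, tsUsec) ||
            !readU32LE(in, inclLen) || !readU32LE(in, origLen)) {
            break;  // end of file or truncated record header
        }
        offsets.push_back(recordStart);
        in.seekg(static_cast<std::streamoff>(inclLen), std::ios::cur);
    }
    return offsets;
}

// Hypothetical helper: reopens the file, seeks to an indexed record, and
// returns only that packet's captured bytes (empty on any read failure).
std::vector<char> readPacketAt(const std::string& path, std::streamoff offset) {
    std::vector<char> data;
    std::ifstream in(path, std::ios::binary);
    if (!in) return data;

    in.seekg(offset + 8, std::ios::beg);  // skip ts_sec and ts_usec
    std::uint32_t inclLen = 0, origLen = 0;
    if (!readU32LE(in, inclLen) || !readU32LE(in, origLen)) return data;

    data.resize(inclLen);
    if (!in.read(data.data(), static_cast<std::streamsize>(inclLen))) data.clear();
    return data;
}
```

    With an index like this, going back to packet i is a matter of calling readPacketAt(path, index[i]) rather than rescanning the file. Storing one std::streamoff per packet costs a few bytes each, so it is worth estimating the packet count of a 40 GB capture before deciding whether to index every packet or only the critical ones.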