Tags: boost, buffer, ethernet, gnuradio

How do I get a specific byte or bytes from a streambuf?


To receive Ethernet frames that carry a raw protocol with custom headers, I am reading the bytes from the socket into a streambuf. The payload gets copied successfully for the most part, but I need to check a specific byte of the frame header in the buffer so I can handle certain corner cases. I can't figure out how to get at that specific byte, or how to get it into an integer. Here is the code:

boost::asio::streambuf read_buffer;

boost::asio::streambuf::mutable_buffers_type buf = read_buffer.prepare(bytesToGet);
bytesRead = d_socket10->receive(boost::asio::buffer(buf, bytesToGet));
read_buffer.commit(bytesRead);

const char *readData = boost::asio::buffer_cast<const char*>( read_buffer.data() + 32 ); 

I need to get the length byte at offset 20. I've tried stringstream, memcpy and casting, but I don't have a handle on any of them: I either get compile errors or the code doesn't do what I thought it should.

How can I get the byte at the offset I need and convert it to a byte or short? The length field is actually 2 bytes, but in this specific case one of those bytes should be zero, so getting either 1 byte or 2 bytes would work.

Thanks!


Solution

  • Welcome to parsing.

    Welcome to binary data.

    Welcome to portable network protocols.

    Each of these three subjects is its own thing to get a handle on.

    The simplest thing would be to read into a buffer and use that. Use Boost Endian to remove portability concerns.
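
    If Boost Endian is not an option, the byte order can also be handled by hand. The following is a minimal sketch, assuming the length field is little-endian on the wire (as the test data further down suggests). `read_le16` and `read_be16` are hypothetical helper names, not part of Boost or the standard library:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decode a 16-bit little-endian value from two raw bytes.
// It only combines byte values, so the result is the same on little-endian
// and big-endian hosts, and the alignment of p does not matter.
inline std::uint16_t read_le16(const char* p) {
    assert(p != nullptr);
    const unsigned lo = static_cast<unsigned char>(p[0]); // low byte comes first
    const unsigned hi = static_cast<unsigned char>(p[1]); // high byte second
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Big-endian ("network byte order") counterpart, in case the protocol uses it.
inline std::uint16_t read_be16(const char* p) {
    assert(p != nullptr);
    const unsigned hi = static_cast<unsigned char>(p[0]); // high byte comes first
    const unsigned lo = static_cast<unsigned char>(p[1]); // low byte second
    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

    Once the first 22 bytes of the frame are in a `char` array `buf`, `read_le16(buf + 20)` would yield the length regardless of the host's byte order.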

    Here's the simplest thing I can think of using just standard library things (ignoring endianness):

    Live On Coliru

    #include <boost/asio.hpp>
    #include <cstdint>
    #include <cstring>
    #include <istream>
    #include <iostream>
    #include <string>
    
    namespace ba = boost::asio;
    
    void fill_testdata(ba::streambuf&);
    
    int main() {
        ba::streambuf sb;
        fill_testdata(sb);
    
        // parsing starts here
        char buf[1024];
        std::istream is(&sb);
        // read the header, up to and including bytes 20..21:
        is.read(buf, 22);
        size_t actual = is.gcount();
    
        std::cout << "stream ok? " << std::boolalpha << is.good() << "\n";
        std::cout << "actual: " << actual << "\n";
        if (is && actual >= 22) { // stream ok, and not a short read
            // memcpy avoids the alignment and aliasing pitfalls of casting
            // buf + 20 to uint16_t*; the value is still in host byte order
            uint16_t length;
            std::memcpy(&length, buf + 20, sizeof(length));
            std::cout << "length: " << length << "\n";
    
            std::string payload(length, '\0');
            is.read(&payload[0], length);
            actual = is.gcount();
    
            std::cout << "actual payload bytes: " << actual << "\n";
            std::cout << "stream ok? " << std::boolalpha << is.good() << "\n";
            payload.resize(actual);
    
            std::cout << "payload: '" << payload << "'\n";
        }
    }
    
    // some testdata
    void fill_testdata(ba::streambuf& sb) 
    {
        char data[] = { 
            '\x00', '\x00', '\x00', '\x00', '\x00', // 0..4
            '\x00', '\x00', '\x00', '\x00', '\x00', // 5..9
            '\x00', '\x00', '\x00', '\x00', '\x00', // 10..14
            '\x00', '\x00', '\x00', '\x00', '\x00', // 15..19
            '\x0b', '\x00', 'H'   , 'e'   , 'l'   , // 20..24
            'l'   , 'o'   , ' '   , 'w'   , 'o'   , // 25..29
            'r'   , 'l'   , 'd'   , '!'   ,         // 30..33
        };
        std::ostream(&sb).write(data, sizeof(data));
    }
    

    Prints

    stream ok? true
    actual: 22
    length: 11
    actual payload bytes: 11
    stream ok? true
    payload: 'Hello world'
    

    Increase \x0b to \x0c to get:

    stream ok? true
    actual: 22
    length: 12
    actual payload bytes: 12
    stream ok? true
    payload: 'Hello world!'
    

    Increasing it to more than is in the buffer, e.g. '\x0d', gives a failed (partial) read:

    stream ok? true
    actual: 22
    length: 13
    actual payload bytes: 12
    stream ok? false
    payload: 'Hello world!'
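
    The stream-based approach above consumes the header bytes as it reads them. If the header only needs to be inspected while the data stays in the streambuf, the bytes can be read in place through iterators. The following is a minimal sketch that uses only the standard library, so it works on any random-access range of bytes. `peek_le16` is a hypothetical helper name, and the little-endian byte order is an assumption based on the test data:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>

// Hypothetical helper: read a 16-bit little-endian value at `offset` from a
// random-access range of bytes, without modifying or consuming the range.
// Returns false when the range is too short to contain both bytes.
template <typename It>
bool peek_le16(It first, It last, std::size_t offset, std::uint16_t& out) {
    using diff_t = typename std::iterator_traits<It>::difference_type;
    const diff_t available = std::distance(first, last);
    assert(available >= 0); // precondition: first must not be past last
    if (available < 2 || offset > static_cast<std::size_t>(available) - 2)
        return false; // both bytes must lie inside the range
    const It p = first + static_cast<diff_t>(offset);
    const unsigned lo = static_cast<unsigned char>(*p);       // low byte
    const unsigned hi = static_cast<unsigned char>(*(p + 1)); // high byte
    out = static_cast<std::uint16_t>(lo | (hi << 8));
    return true;
}
```

    With Boost Asio, a suitable range for the question's `read_buffer` can be obtained from `boost::asio::buffers_begin(read_buffer.data())` and `boost::asio::buffers_end(read_buffer.data())`, so `peek_le16(first, last, 20, length)` would read the length field at offset 20 without calling `consume`.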
    

    Let's Go Pro

    To go pro, I'd use a library such as Boost Spirit. It understands endianness, validates the input as it parses, and really shines when you get branches in your parser, like

     record = compressed_record | uncompressed_record;
    

    Or

     exif_tags = .... >> custom_attrs;
    
     custom_attr  = attr_key >> attr_value;
     custom_attrs = repeat(_ca_count) [ custom_attr ];
    
     attr_key = bson_string(64);     // max 64, for security
     attr_value = bson_string(1024); // max 1024, for security
    
     bson_string %= omit[little_dword[_a=_1]] 
                 >> eps(_a<=_r1) // not exceeding maximum
                 >> repeat(_a) [byte_];
    

    But that's noodling far ahead. Let's do a much simpler demo:

    Live On Coliru ¹

    #include <boost/asio.hpp>
    
    #include <cstdint>
    #include <istream>
    #include <iostream>
    #include <stdexcept>
    #include <string>
    
    namespace ba = boost::asio;
    
    void fill_testdata(ba::streambuf&);
    
    struct FormatData {
        std::string signature, header; // e.g. 4 + 16 = 20 bytes - could be different, of course
        std::string payload;           // 16bit length prefixed
    };
    
    FormatData parse(std::istream& is);
    
    int main() {
        ba::streambuf sb;
        fill_testdata(sb);
    
        try {
            std::istream is(&sb);
            FormatData data = parse(is);
    
            std::cout << "actual payload bytes: " << data.payload.length() << "\n";
            std::cout << "payload: '" << data.payload << "'\n";
        } catch(std::runtime_error const& e) {
            std::cout << "Error: " << e.what() << "\n";
        }
    }
    
    // some testdata
    void fill_testdata(ba::streambuf& sb) 
    {
        char data[] = { 
            'S'   , 'I'   , 'G'   , 'N'   , '\x00'   , // 0..4
            '\x00', '\x00', '\x00', '\x00', '\x00'   , // 5..9
            '\x00', '\x00', '\x00', '\x00', '\x00'   , // 10..14
            '\x00', '\x00', '\x00', '\x00', '\x00'   , // 15..19
            '\x0b', '\x00', 'H'   , 'e'   , 'l'      , // 20..24
            'l'   , 'o'   , ' '   , 'w'   , 'o'      , // 25..29
            'r'   , 'l'   , 'd'   , '!'   , // 30..33
        };
        std::ostream(&sb).write(data, sizeof(data));
    }
    
    //#define BOOST_SPIRIT_DEBUG
    #include <boost/fusion/adapted/struct.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    namespace qi = boost::spirit::qi;
    
    BOOST_FUSION_ADAPT_STRUCT(FormatData, signature, header, payload)
    
    template <typename It>
    struct FileFormat : qi::grammar<It, FormatData()> {
        FileFormat() : FileFormat::base_type(start) {
            using namespace qi;
    
            signature  = string("SIGN");     // 4 byte signature, just for example
            header     = repeat(16) [byte_]; // 16 byte header, same
    
            payload   %= omit[little_word[_len=_1]] >> repeat(_len) [byte_];
            start      = signature >> header >> payload;
    
            //BOOST_SPIRIT_DEBUG_NODES((start)(signature)(header)(payload))
        }
      private:
        qi::rule<It, FormatData()> start;
        qi::rule<It, std::string()> signature, header;
    
        qi::_a_type _len;
        qi::rule<It, std::string(), qi::locals<uint16_t> > payload;
    };
    
    FormatData parse(std::istream& is) {
        using it = boost::spirit::istream_iterator;
    
        FormatData data;
        it f(is >> std::noskipws), l;
        bool ok = qi::parse(f, l, FileFormat<it>{}, data);
    
        if (!ok)
            throw std::runtime_error("parse failure");
    
        return data;
    }
    

    Prints:

    actual payload bytes: 11
    payload: 'Hello world'
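
    For comparison, the same frame layout can also be parsed by hand with only the standard library. This is a minimal sketch, not a replacement for the grammar above. `read_exact`, `parse_frame` and `parse_frame_bytes` are hypothetical names, and the `max_payload` default of 1500 is an assumed sanity limit rather than something the format prescribes:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Same layout as the FormatData struct in the demo above.
struct FormatData {
    std::string signature, header; // 4 + 16 bytes
    std::string payload;           // prefixed by a 16-bit little-endian length
};

// Read exactly n bytes or throw; a short read is treated as a malformed frame.
inline std::string read_exact(std::istream& is, std::size_t n) {
    std::string s(n, '\0');
    if (n == 0)
        return s;
    if (!is.read(&s[0], static_cast<std::streamsize>(n)))
        throw std::runtime_error("short read");
    return s;
}

// max_payload is an assumed upper bound; adjust it to the real protocol.
inline FormatData parse_frame(std::istream& is, std::size_t max_payload = 1500) {
    FormatData d;
    d.signature = read_exact(is, 4);
    if (d.signature != "SIGN")
        throw std::runtime_error("bad signature");
    d.header = read_exact(is, 16);

    // Decode the length byte by byte, so host byte order does not matter.
    const std::string len = read_exact(is, 2);
    const unsigned lo = static_cast<unsigned char>(len[0]);
    const unsigned hi = static_cast<unsigned char>(len[1]);
    const std::size_t n = lo | (hi << 8);
    if (n > max_payload)
        throw std::runtime_error("payload length exceeds limit");

    d.payload = read_exact(is, n);
    return d;
}

// Convenience overload for bytes that are already held in a string.
inline FormatData parse_frame_bytes(const std::string& bytes) {
    std::istringstream is(bytes);
    return parse_frame(is);
}
```

    Unlike the grammar, every check here is spelled out manually, which stays manageable for a fixed header like this one but becomes tedious once the format grows branches.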
    

    ¹ What a time to be alive! Coliru swamped and wandbox down, simultaneously! Had to remove Boost Asio for the online demo because IdeOne doesn't link Boost System