
Knowing the end of a message in a protocol with msgpack


How do you manage the end of a message in a protocol? I use msgpack-c, and the only solution I have found is to send the header separately, before the payload.

Send the header to the client:

// header
{
  "message_type": "hello",
  "payload_size": 10
}

The client receives the header, unpacks it, allocates a buffer of "payload_size" bytes, and then reads data from the stream. Once the buffer is full, the message is complete.

I want to send the header and body together in a single message:
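For illustration, the receive side of this header-then-payload scheme could look roughly like the sketch below. It rests on assumptions that are not part of my actual protocol: `FrameReader` and `feed()` are hypothetical names, and the header is reduced to a 4-byte big-endian `payload_size` instead of a MessagePack map, so the example has no dependency on msgpack-c.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: reassembles [4-byte big-endian size][payload] frames
// from chunks that may be split at arbitrary positions.
class FrameReader {
public:
    // Append a received chunk and return every payload completed by it.
    std::vector<std::string> feed(const char* data, std::size_t len) {
        buf_.append(data, len);
        std::vector<std::string> out;
        for (;;) {
            if (buf_.size() < kHeaderSize) break;         // header incomplete
            std::uint32_t size = 0;
            for (std::size_t i = 0; i < kHeaderSize; ++i) {
                size = (size << 8) | static_cast<unsigned char>(buf_[i]);
            }
            if (buf_.size() - kHeaderSize < size) break;  // payload incomplete
            out.push_back(buf_.substr(kHeaderSize, size));
            buf_.erase(0, kHeaderSize + size);            // drop the finished frame
        }
        return out;
    }

private:
    static constexpr std::size_t kHeaderSize = 4;
    std::string buf_;  // bytes received but not yet turned into a frame
};
```

Each call to `feed()` returns zero or more complete payloads, so the read callback does not need to know where the chunk boundaries fall.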

{
  "header": { "message_type":"hello", "payload_size": 10},
  "payload": {...} // can come in multiple frame
}

My problem is that I don't know whether it is possible to partially unpack the header to learn the size before the full message has arrived (the message is split if > 4096kb due to a libevent restriction).

How would you do that? I am open to all solutions.


Solution

  • C++

    Using unpack() function

    You can use the offset parameter of the unpack() function. See https://github.com/msgpack/msgpack-c/wiki/v2_0_cpp_unpacker#client-controls-a-buffer

    Here is a code example:

    #include <iostream>
    #include <tuple>
    #include <msgpack.hpp>
    
    int main() {
        msgpack::sbuffer buf;
        msgpack::pack(buf, std::make_tuple("first message", 123, 56.78));
        msgpack::pack(buf, std::make_tuple("second message", 42));
        std::size_t off = 0; // cursor of buf
        {
            // unpack() starts parsing at off (0)
            auto oh = msgpack::unpack(buf.data(), buf.size(), off);
            // off is updated to 25, the MessagePack-encoded byte size
            // of ["first message",123,56.78]
            // (shown in JSON notation; the actual format is MessagePack)
            std::cout << "off:" << off << std::endl;
            std::cout << *oh << std::endl;
        }
        {
            // unpack() starts parsing at off (25)
            auto oh = msgpack::unpack(buf.data(), buf.size(), off);
            // off is updated to 42.
            // 42 - 25 = 17, the MessagePack-encoded byte size
            // of ["second message",42]
            // (shown in JSON notation; the actual format is MessagePack)
            std::cout << "off:" << off << std::endl;
            std::cout << *oh << std::endl;
        }
    }
    

    Output is

    off:25
    ["first message",123,56.78]
    off:42
    ["second message",42]
    

    msgpack-c's unpack() keeps track of the position in the buffer for you (through off), so you don't need to pass payload_size.

    In addition, you can mix non-MessagePack data into the buffer.

    +--------------------+-----------------------------+--------------------+
    | MessagePackBytes1  | Any format user knows size  | MessagePackBytes2  |
    +--------------------+-----------------------------+--------------------+
    

    Let's say the user knows that the data consists of MessagePackBytes1 (unknown size), data in any format (known size), and MessagePackBytes2 (unknown size).

    #include <iostream>
    #include <string>
    #include <tuple>
    #include <msgpack.hpp>
    
    int main() {
        msgpack::sbuffer buf;
        msgpack::pack(buf, std::make_tuple("first message", 123, 56.78));
        std::string non_mp = "non mp format data";
        buf.write(non_mp.data(), non_mp.size());
        msgpack::pack(buf, std::make_tuple("second message", 42));
        std::size_t off = 0; // cursor of buf
        {
            auto oh = msgpack::unpack(buf.data(), buf.size(), off);
            std::cout << "off:" << off << std::endl;
            std::cout << *oh << std::endl;
        }
        {
            std::string extracted{buf.data() + off, non_mp.size()};
            std::cout << extracted << std::endl;
            off += non_mp.size();
        }
        {
            auto oh = msgpack::unpack(buf.data(), buf.size(), off);
            std::cout << "off:" << off << std::endl;
            std::cout << *oh << std::endl;
        }
    }
    

    Output is

    off:25
    ["first message",123,56.78]
    non mp format data
    off:60
    ["second message",42]
    

    Using unpacker

    This is a little more advanced, but it might fit streaming use cases. See https://github.com/msgpack/msgpack-c/wiki/v2_0_cpp_unpacker#msgpack-controls-a-buffer

    Here is an example that unpacks MessagePack data from messages received in consecutive, scattered chunks: https://github.com/msgpack/msgpack-c/blob/700167995927f0348fb08ae2579440c1bc135480/example/boost/asio_send_recv.cpp#L41-L64

  • C

    The C version is basically similar to the C++ one.

    Using unpack() function

    The C version has a similar unpack function. Here is the prototype:

    msgpack_unpack_return
    msgpack_unpack_next(msgpack_unpacked* result,
            const char* data, size_t len, size_t* off);
    

    You can pass off as the offset, as in the C++ version. C doesn't have references, so you need to pass the address of off using &off. See https://github.com/msgpack/msgpack-c/wiki/v2_0_c_overview#using-unpack-function

    If you want to know the size of an individual variable-length field such as a string, you can access the size member of the unpacked object. For example:

    typedef struct {
        uint32_t size;
        struct msgpack_object* ptr;
    } msgpack_object_array;
    
    typedef struct {
        uint32_t size;
        const char* ptr;
    } msgpack_object_str;
    
    typedef struct {
        uint32_t size;
        const char* ptr;
    } msgpack_object_bin;
    
    typedef struct {
        int8_t type;
        uint32_t size;
        const char* ptr;
    } msgpack_object_ext;
    
    

    See https://github.com/msgpack/msgpack-c/wiki/v2_0_c_overview#object

    Using unpacker

    See https://github.com/msgpack/msgpack-c/wiki/v2_0_c_overview#using-unpacker