Receiving large binary data over Boost::Beast websocket


I am trying to receive a large amount of data using a boost::beast::websocket, fed by another boost::beast::websocket. Normally this data is sent to a connected browser, but I'd like to set up a purely C++ unit test validating certain components of the traffic. On the sender I enabled auto-fragmentation with a maximum size of 1 MB, but after a few messages the receiver reports:

Read 258028 bytes of binary
Read 1547176 bytes of binary
Read 168188 bytes of binary
"Failed read: The WebSocket message exceeded the locally configured limit"

Now, I don't expect a fully developed, well-supported browser to behave the same way as my possibly poorly architected unit test, and it doesn't. The browser has no issue reading 25MB messages over the websocket. My boost::beast::websocket, on the other hand, hits a limit.

So before I go down a rabbit hole, I'd like to see if anyone has any thoughts on this. My read section looks like this:

void on_read(boost::system::error_code ec, std::size_t bytes_transferred)
{
    boost::ignore_unused(bytes_transferred);

    if (ec)
    {
        m_log.error("Failed read: " + ec.message());
        // Stop the websocket
        stop();
        return;
    }

    std::string data(boost::beast::buffers_to_string(m_buffer.data()));

    // Yes I know this looks dangerous. The sender always sends as binary but occasionally sends JSON 
    if (!data.empty() && data.front() == '{')  // guard: at(0) throws on an empty message
        m_log.debug("Got message: " + data);
    else
        m_log.debug("Read " + utility::to_string(m_buffer.data().buffer_bytes()) + " of binary data");

    // Do the things with the incoming data
    for (auto&& callback : m_read_callbacks)
        callback(data);

    // Toss the data
    m_buffer.consume(bytes_transferred);

    // Wait for some more data
    m_websocket.async_read(
        m_buffer,
        std::bind(
            &WebsocketClient::on_read,
            shared_from_this(),
            std::placeholders::_1,
            std::placeholders::_2));
}

I saw in a separate example that instead of doing an async read, you can do a for/while loop reading some data until the message is done (https://www.boost.org/doc/libs/1_67_0/libs/beast/doc/html/beast/using_websocket/send_and_receive_messages.html). Would this be the right approach for an always open websocket that could send some pretty massive messages? Would I have to send some indicator to the client that the message is indeed done? And would I run into the exceeded buffer limit issue using this approach?


Solution

  • If your use pattern is fixed:

    std::string data(boost::beast::buffers_to_string(m_buffer.data()));
    

    And then, in particular

        callback(data);
    

    Then reading block-wise gains you nothing, since you end up holding the whole message in memory anyway. Instead, you can raise the "locally configured limit":

    ws.read_message_max(20ull << 20); // sets the limit to 20 MiB
    

    The default value is 16 MiB (as of Boost 1.75).
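
To make the size arithmetic concrete, here is a rough sketch of where the call goes. It is written as a template so that it compiles without Boost: `mib`, `raise_read_limit` and the `FakeStream` stand-in are hypothetical names used only for illustration, and `read_message_max` is the only real `boost::beast::websocket::stream` member involved. Note that the 25MB messages mentioned in the question would still exceed a 20 MiB limit, so the value should be chosen from the largest message you actually expect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Convert a count of mebibytes to bytes (1 MiB = 2^20 bytes).
constexpr std::uint64_t mib(std::uint64_t n) { return n << 20; }

static_assert(mib(16) == 16777216ull, "default limit cited in the answer");
static_assert(mib(20) == 20971520ull, "same value as 20ull << 20");

// Works with any type exposing read_message_max(std::size_t),
// which boost::beast::websocket::stream does.
template <class WebSocket>
void raise_read_limit(WebSocket& ws, std::uint64_t bytes)
{
    ws.read_message_max(static_cast<std::size_t>(bytes));
}

// Hypothetical stand-in (not part of Beast) so the sketch builds without Boost.
struct FakeStream
{
    std::size_t limit = static_cast<std::size_t>(mib(16));
    void read_message_max(std::size_t n) { limit = n; }
};
```

With the real stream from the question this would be something like `raise_read_limit(m_websocket, mib(32));`, typically called once before the first `async_read`.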

    Side Note

    You can probably also use ws.got_binary() to detect whether the last message received was binary or not.
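
A minimal sketch of that check, templated so it does not depend on Boost being installed: `got_binary()` is the real stream member, while `describe_last_message` and `StubStream` are hypothetical names for illustration. This only helps if the sender marks its JSON messages as text (for example with `ws.text(true)` before writing); the question says the sender currently sends everything as binary, in which case `got_binary()` would always return true.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Works with any type exposing got_binary(), as
// boost::beast::websocket::stream does. Call it after a read completes.
template <class WebSocket>
std::string describe_last_message(WebSocket const& ws, std::size_t bytes)
{
    if (ws.got_binary())
        return "Read " + std::to_string(bytes) + " bytes of binary";
    return "Got text message of " + std::to_string(bytes) + " bytes";
}

// Hypothetical stand-in (not part of Beast) so the sketch builds without Boost.
struct StubStream
{
    bool binary = true;
    bool got_binary() const { return binary; }
};
```

In `on_read` from the question, this could replace the `'{'` check, for example `m_log.debug(describe_last_message(m_websocket, bytes_transferred));`.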