
Writing the receive buffer to disk in real time when using Boost.Asio


I have a server written with Boost.Asio. It receives a file from the client and writes it to disk. My problem is that the server only writes the file to disk once it has received it completely. I want the server to write the buffer to disk in real time instead, for example every 100 KB it receives from the client. I have written the following code, but I don't know how to change it to achieve this.

void Session::DoReadFileContent(size_t arg_bytes_transferred)
{
    if (arg_bytes_transferred > 0)
    {
        m_outputFile.write(m_buffer.data(), static_cast<std::streamsize>(arg_bytes_transferred));

        if (m_outputFile.tellp() >= static_cast<std::streamsize>(m_fileSize))
        {
            std::cout << "Received file: " << m_fileName << std::endl;
            return;
        }
    }

    auto self = shared_from_this();

    m_socket.async_read_some(boost::asio::buffer(m_buffer.data(), m_buffer.size()),
        [this, self](boost::system::error_code arg_error_code, size_t arg_bytes)
        {
            DoReadFileContent(arg_bytes);
        });
}

Solution

  • First off, in that case it seems better to read explicit sizes of data instead of using read_some, which returns whatever happens to be available.

    In this pattern, it becomes easier to track the number of bytes still to be received than to compare against m_fileSize.

    Here is a minor reshuffle that turns your code into a self-contained example. It expects a server to send a line of text giving the payload size and output file name, followed by the contents of that file. An example server can be run with netcat, e.g.:

    (stat -c '%soutput.dat' main.cpp; cat main.cpp) | netcat -l -p 6868
    


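    #include <boost/asio.hpp>
    #include <algorithm> // std::min
    #include <array>
    #include <cstdint>   // uint16_t
    #include <fstream>
    #include <iostream>
    #include <memory>    // std::enable_shared_from_this, std::make_shared
    #include <string>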
    
    using boost::system::error_code;
    using boost::asio::ip::tcp;
    
    struct Session : std::enable_shared_from_this<Session> {
    
        Session(boost::asio::io_context& io, uint16_t port)
         : m_socket(io) 
        {
            m_socket.connect({{}, port});
        }
    
        void Start();
        void DoReadFileContent(size_t transferred = 0);
    
      private:
        std::array<char, 1024> m_buffer;
        std::streamsize m_remainingSize = 0;
        std::string     m_fileName      = "noname.dat";
        std::ofstream   m_outputFile;
    
        tcp::socket m_socket;
    };
    
    void Session::Start() {
        // Read a header line (size and file name, in text for simplicity), then
        // receive that many payload bytes.
        //
        // I'm keeping this synchronous for simplicity, because you probably
        // already have this coded somewhere
        boost::asio::streambuf buf;
        error_code ec;
        auto n = read_until(m_socket, buf, "\n", ec);
    
        std::istream is(&buf);
        if (is >> m_remainingSize && getline(is, m_fileName)) {
            std::cerr << "Protocol trace: n:" << n << ", fileName:" << m_fileName << " payload_size:" << m_remainingSize << "\n";
    
            m_outputFile.exceptions(std::ios::failbit | std::ios::badbit);
            m_outputFile.open(m_fileName, std::ios::binary);
    
            // write excess buffer contents as part of payload
            if (buf.size()) {
                std::cerr << "Writing " << buf.size() << " bytes\n";
                m_remainingSize -= buf.size();
                m_outputFile << &buf;
            }
    
            DoReadFileContent();
        } else {
            std::cerr << "Protocol error, payload_size expected\n";
        }
    }
    void Session::DoReadFileContent(size_t transferred) {
        if (transferred > 0) {
            std::cerr << "Writing " << transferred << " bytes\n";
            m_remainingSize -= transferred;
            m_outputFile.write(m_buffer.data(), transferred);
        }
        if (m_remainingSize <= 0) {
            std::cout << "Completed file: " << m_fileName << std::endl;
            return;
        }
    
        auto self = shared_from_this();
        auto expect = std::min(size_t(m_remainingSize), m_buffer.size());
        std::cout << "Trying to receive next " << expect << " bytes" << std::endl;
        async_read(m_socket,
            boost::asio::buffer(m_buffer.data(), expect),
            [this, self](error_code ec, size_t arg_bytes) {
                std::cerr << "async_read: " << ec.message() << " - " << arg_bytes << " bytes\n";
                if (!ec) {
                    DoReadFileContent(arg_bytes);
                }
            });
    }
    
    int main() {
        boost::asio::io_context io;
    
        std::make_shared<Session>(io, 6868) // download from port 6868
            ->Start();
    
        io.run(); // complete
    }
    

    Testing with

    (stat -c '%soutput.dat' main.cpp; cat main.cpp) | netcat -l -p 6868&
    ./a.out
    md5sum main.cpp output.dat
    

    Prints, e.g.:

    Protocol trace: n:15, fileName:output.dat payload_size:2654
    Trying to receive next 1024 bytes
    async_read: Success - 1024 bytes
    Writing 1024 bytes
    Trying to receive next 1024 bytes
    async_read: Success - 1024 bytes
    Writing 1024 bytes
    Trying to receive next 606 bytes
    async_read: Success - 606 bytes
    Writing 606 bytes
    Completed file: output.dat
    

    The last two lines

    b4eec7203f6a1dcbfbf3d298c7ec0832  main.cpp
    b4eec7203f6a1dcbfbf3d298c7ec0832  output.dat
    

    indicate that the received file is identical to the original.
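
    The example hands every chunk to the std::ofstream as soon as it arrives, but the stream may still hold data in its own buffer. If the goal is for the data to reach the operating system roughly every 100 KB, as in the question, one option is to flush explicitly once a byte threshold has been crossed. The following is only a sketch of that idea: ThresholdFlusher and its member names are hypothetical and not part of the code above, and flush() only passes the data on to the OS, which is not the same as forcing it onto the physical disk.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <fstream>
    #include <ostream>

    // Hypothetical helper (not part of the example above): forwards writes to a
    // stream and flushes it once at least `threshold` bytes have accumulated.
    class ThresholdFlusher {
      public:
        explicit ThresholdFlusher(std::ostream& out, std::size_t threshold = 100 * 1024)
            : m_out(out), m_threshold(threshold) {
            assert(threshold > 0);
        }

        void Write(const char* data, std::size_t size) {
            m_out.write(data, static_cast<std::streamsize>(size));
            m_pending += size;
            if (m_pending >= m_threshold) {
                m_out.flush(); // hands buffered bytes to the OS; this is not an fsync
                m_pending = 0;
                ++m_flushes;
            }
        }

        std::size_t Pending() const { return m_pending; } // bytes written since the last flush
        std::size_t Flushes() const { return m_flushes; } // number of explicit flushes so far

      private:
        std::ostream& m_out;
        std::size_t   m_threshold;
        std::size_t   m_pending = 0;
        std::size_t   m_flushes = 0;
    };
    ```

    In DoReadFileContent, the call to m_outputFile.write could then go through such a helper, with one final flush when the transfer completes. The stream may still flush earlier on its own when its internal buffer fills up, so the threshold is an upper bound on how much data stays unflushed, not an exact write size.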

    Notes:

    • Data is delivered in chunks of unspecified size. On my system, for example, the same file is received as:

       Protocol trace: n:15, fileName:output.dat payload_size:2654
       Writing 497 bytes
       Trying to receive next 1024 bytes
       async_read: Success - 1024 bytes
       Writing 1024 bytes
       Trying to receive next 1024 bytes
       async_read: Success - 1024 bytes
       Writing 1024 bytes
       Trying to receive next 109 bytes
       async_read: Success - 109 bytes
       Writing 109 bytes
       Completed file: output.dat
       b4eec7203f6a1dcbfbf3d298c7ec0832  main.cpp
       b4eec7203f6a1dcbfbf3d298c7ec0832  output.dat
      

      Note that it starts out with 497 bytes already in the input buffer from the read_until.

    • The protocol is not secure:
      • The file names should be validated. Just imagine what happens if the file name were '/home/sehe/myimportant_file.txt' or worse, say, /dev/sde1 while we have permission to do raw block device access...
      • You might want to specify a maximum size for the streambuf (boost::asio::streambuf accepts one as a constructor argument), so that a fuzzer that never sends a '\n' cannot make you gobble up all RAM.
    • The error handling on file I/O is very rough. I used iostream exceptions, but you probably want to check m_outputFile.good() at various places instead.
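
    The file name and file I/O notes above could be sketched as follows. This assumes C++17 (for std::filesystem) and that only bare file names in the current directory should be accepted, which is a policy chosen for illustration. IsSafeFileName and WriteChunk are hypothetical helpers, not part of the code above.

    ```cpp
    #include <cstddef>
    #include <filesystem>
    #include <fstream>
    #include <string>

    // Hypothetical helper: accept only a bare file name, so the sender cannot
    // choose a directory, an absolute path, "." or "..".
    bool IsSafeFileName(const std::string& name) {
        if (name.empty() || name == "." || name == "..") return false;
        const std::filesystem::path p(name);
        if (p.is_absolute() || p.has_parent_path()) return false;
        return p.filename() == p; // the whole path is just its final component
    }

    // Hypothetical helper: write one chunk and report failure through the
    // stream state instead of through exceptions.
    bool WriteChunk(std::ofstream& out, const char* data, std::size_t size) {
        out.write(data, static_cast<std::streamsize>(size));
        return out.good();
    }
    ```

    Session::Start could reject the transfer when IsSafeFileName(m_fileName) returns false, and DoReadFileContent could stop reading when WriteChunk returns false. Both helpers are deliberately strict and minimal; a real server would likely also restrict the target directory and the maximum payload size.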