Can a single instance of std::io::BufReader on a tokio::net::TcpStream lead to data loss when the BufReader is used to read_until a given (byte) delimiter?
That is, is there any possibility that after I use the BufReader for:
let buffer = Vec::new();
let reader = BufReader::new(tcp_stream);
tokio::io::read_until(reader, delimiter, buffer)
.map(move |(s, _)| s.into_inner())
a subsequent tokio::io::read on the same stream would return data that starts past the byte immediately after the delimiter, thereby causing data loss?
I have an issue (and complete reproducible example on Linux) that I have trouble explaining if the above assumption isn't correct.
I have a TCP server that is supposed to send the content of a file to multiple TCP clients following multiple concurrent requests.
Sometimes, always using the same inputs, the client receives less data than expected, so the transfer fails.
The error is not raised 100% of the time (that is, some of the client requests still succeed), but with the 100 tries defined in tcp_client.rs it was always reproducible for at least one of them.
The sequence of data transferred between client and server is composed of:
This issue is reproducible only if steps 1, 2 and 3 are involved; otherwise it works as expected.
The error is raised when this tokio::io::read (used to read the file content) returns 0, as if the server had closed the connection, even if the server is actually up and running and all the data has been sent (there is an assertion after tokio::io::copy, and I checked the TCP packets using a packet sniffer). On a side note, in all my runs the amount of data read before the error was always more than 95% of the expected amount.
Most importantly, the common.rs module defines 2 different read_* functions:
read_until: currently used.
read_exact: not used.
The logic of the two is the same: they need to read the request/response (and both client and server can be updated to use one or the other). What is surprising is that the bug presents itself only when tokio::io::read_until is used, while tokio::io::read_exact works as expected.
Unless I misused tokio::io::read_until or there is a bug in my implementation, I expected both versions to work without any issue. What I am seeing instead is this panic being raised because some clients cannot read all the data sent by the server.
Yes. This is described in the documentation for BufReader (emphasis mine):
When the BufReader is dropped, the contents of its buffer will be discarded.
The next sentence is correct but not extensive enough:
Creating multiple instances of a BufReader on the same stream can cause data loss.
The BufReader has read data from the underlying source and put it in the buffer, then you've thrown away the buffer. The data is gone.
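The effect can be reproduced without any networking. The following is a minimal sketch, not the code from the question: it uses the synchronous std::io API, and an in-memory std::io::Cursor stands in for the TcpStream (an assumption made so the example is self-contained). The function names read_then_discard_reader and read_then_keep_reader are hypothetical. The first function mirrors the pattern in the question (read_until, then into_inner); the second keeps the same BufReader for all later reads.

```rust
use std::io::{BufRead, BufReader, Cursor, Read};

/// Reads up to and including `delim` through a BufReader, then unwraps the
/// BufReader and reads the remainder directly from the underlying source.
/// `Cursor` is a stand-in for a socket (assumption for illustration).
fn read_then_discard_reader(data: &[u8], delim: u8) -> (Vec<u8>, Vec<u8>) {
    let mut reader = BufReader::new(Cursor::new(data.to_vec()));

    let mut header = Vec::new();
    // The BufReader fills its internal buffer with as much as the source
    // yields, which may be far more than the bytes up to the delimiter.
    reader.read_until(delim, &mut header).unwrap();

    // into_inner returns the source but drops whatever is still buffered.
    let mut source = reader.into_inner();

    let mut rest = Vec::new();
    source.read_to_end(&mut rest).unwrap();
    (header, rest)
}

/// Same reads, but every read goes through the one BufReader, so bytes that
/// were buffered past the delimiter are still delivered.
fn read_then_keep_reader(data: &[u8], delim: u8) -> (Vec<u8>, Vec<u8>) {
    let mut reader = BufReader::new(Cursor::new(data.to_vec()));

    let mut header = Vec::new();
    reader.read_until(delim, &mut header).unwrap();

    let mut rest = Vec::new();
    // Reading via the BufReader drains its buffer before touching the source.
    reader.read_to_end(&mut rest).unwrap();
    (header, rest)
}
```

With a small input such as b"HEADER\nBODY", the whole input fits in the BufReader's buffer on the first fill, so the first function returns an empty remainder while the second returns the body. On a real socket the amount lost depends on how many bytes happened to arrive before the buffer was filled, which would be consistent with the failure showing up only on some runs.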