When I change the size of an ofstream buffer with pubsetbuf(...), everything works fine, except when I write a single string longer than 1023 characters to the ofstream (see the code below). Is this correct behavior, or did I do something wrong?
#include <fstream>
#include <string>
#include <vector>
#include <unistd.h> // sleep()

int main() {
    std::vector<char> rawBuf;
    std::ofstream stream;
    rawBuf.resize(20000);
    // Install the custom buffer before the file is opened.
    stream.rdbuf()->pubsetbuf(&rawBuf[0], 20000);
    stream.open("file.txt", std::ios_base::app);
    std::string data(1499, 'b');
    for (int i = 0; i < 10; i++)
    {
        stream << data.substr(0, 1024) << "\n"; // a 1023-length string works great
        sleep(1);
    }
    stream.flush();
    stream.close();
    return 0;
}
With a 1024-character string, strace ./program shows something like this:
writev(3, [{iov_base=NULL, iov_len=0}, {iov_base="bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"..., iov_len=1024}], 2) = 1024
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcf3889ac0) = 0
writev(3, [{iov_base="\n", iov_len=1}, {iov_base="bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"..., iov_len=1024}], 2) = 1025
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffcf3889ac0) = 0
... and so on 10x
With a 1023-character string, everything seems OK:
nanosleep({tv_sec=1, tv_nsec=0}, 0x7fff8e13a980) = 0
nanosleep({tv_sec=1, tv_nsec=0}, 0x7fff8e13a980) = 0
... 10x
and then:
write(3, "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"..., 10240) = 10240
Why is there a single write here, but not in the earlier case?
edit:
gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)
basic_streambuf* setbuf(char_type* s, streamsize n) override;
Effects: If setbuf(0, 0) is called on a stream before any I/O has occurred on that stream, the stream becomes unbuffered. Otherwise the results are implementation-defined. “Unbuffered” means that pbase() and pptr() always return null and output to the file should appear as soon as possible.
“Implementation-defined” includes “works fine” and “there is only a single write” and other things. In fact, here's what libstdc++ 7.3.0 says:
First, are you sure that you understand buffering? Particularly the fact that C++ may not, in fact, have anything to do with it?
The rules for buffering can be a little odd, but they aren't any different from those of C. (Maybe that's why they can be a bit odd.) Many people think that writing a newline to an output stream automatically flushes the output buffer. This is true only when the output stream is, in fact, a terminal and not a file or some other device -- and that may not even be true since C++ says nothing about files nor terminals. All of that is system-dependent. (The "newline-buffer-flushing only occurring on terminals" thing is mostly true on Unix systems, though.)
Some people also believe that sending endl down an output stream only writes a newline. This is incorrect; after a newline is written, the buffer is also flushed. Perhaps this is the effect you want when writing to a screen -- get the text out as soon as possible, etc -- but the buffering is largely wasted when doing this to a file:
output << "a line of text" << endl;
output << some_data_variable << endl;
output << "another line of text" << endl;
The proper thing to do in this case is to just write the data out and let the libraries and the system worry about the buffering. If you need a newline, just write a newline:
output << "a line of text\n" << some_data_variable << '\n' << "another line of text\n";
I have also joined the output statements into a single statement. You could make the code prettier by moving the single newline to the start of the quoted text on the last line, for example.
If you do need to flush the buffer above, you can send an endl if you also need a newline, or just flush the buffer yourself:
output << ...... << flush; // can use std::flush manipulator
output.flush(); // or call a member fn
On the other hand, there are times when writing to a file should be like writing to standard error; no buffering should be done because the data needs to appear quickly (a prime example is a log file for security-related information). The way to do this is just to turn off the buffering before any I/O operations at all have been done (note that opening counts as an I/O operation):
std::ofstream os;
std::ifstream is;
int i;
os.rdbuf()->pubsetbuf(0,0);
is.rdbuf()->pubsetbuf(0,0);
os.open("/foo/bar/baz");
is.open("/qux/quux/quuux");
...
os << "this data is written immediately\n";
is >> i; // and this will probably cause a disk read
Since all aspects of buffering are handled by a streambuf-derived member, it is necessary to get at that member with rdbuf(). Then the public version of setbuf can be called. The arguments are the same as those for the Standard C I/O Library function (a buffer area followed by its size).
A great deal of this is implementation-dependent. For example, streambuf does not specify any actions for its own setbuf()-ish functions; the classes derived from streambuf each define behavior that "makes sense" for that class: an argument of (0,0) turns off buffering for filebuf but does nothing at all for its siblings stringbuf and strstreambuf, and specifying anything other than (0,0) has varying effects. User-defined classes derived from streambuf can do whatever they want. (For filebuf and arguments for (p,s) other than zeros, libstdc++ does what you'd expect: the first s bytes of p are used as a buffer, which you must allocate and deallocate.)
A last reminder: there are usually more buffers involved than just those at the language/library level. Kernel buffers, disk buffers, and the like will also have an effect. Inspecting and changing those are system-dependent.