I'm working on an HTML form processor in C++, mainly as a learning experience. I have a little output buffer class to allow me to send the Content-Length header. It works fine until I try reading in and outputting a template file. It's on a Windows system, so the lines are of course terminated with \r\n, but when I use the length() method on my buffer string, it's not counting both characters, and my Content-Length ends up short. I tried reading the file both with and without ios::binary, and it makes no difference.
[EDIT]
OK, sorry, here is minimal code which reproduces the problem:
    #include <iostream>
    #include <fstream>
    #include <sys/stat.h>
    using namespace std;

    // Size of the file on disk in bytes, or 0 if stat() fails.
    size_t fileSize(const char* filename) {
        struct stat st;
        if (stat(filename, &st) != 0) return 0;
        return st.st_size;
    }

    int main() {
        char fName[] = "testack.html";
        char oName[] = "testout.txt";
        ifstream inFile;
        inFile.open(fName, ios::binary);
        if (!inFile.good()) return 1;   // bail out rather than print an unset buffer

        // Read the whole file, byte for byte, into a NUL-terminated buffer.
        int _size = fileSize(fName);
        char *_content = new char[_size + 1];
        inFile.read(_content, _size);
        _content[_size] = 0;

        ofstream os(oName);             // note: the output file is opened in text mode
        os << _content;
        delete[] _content;
        return 0;
    }
And here is the test file:
<HTML><BODY>Hello World!</BODY></HTML>
That is 38 bytes, and Windows and my program and everyone agrees, and I end up with 38 bytes in testout.txt.
Now, if I add a single line break:
<HTML>
<BODY>Hello World!</BODY></HTML>
Windows says it's 40 bytes (as I would expect), my program reads 40 bytes, and I end up with 41 bytes in the output file. With a second line break:
<HTML>
<BODY>
Hello World!</BODY></HTML>
Windows says 42 bytes, my program reads 42, and I end up with 44 in the output file. So, it appears that an extra byte is being added to each line break when I output it, whether to a file or to stdout. At this point I'm completely confused. Any ideas?
[EDIT]
And, with a little more testing I discovered that an extra \r is being added to each line, thus I have, for example:
<HTML>\r\r\n
stdout in Binary Mode
As indicated by my edits and comments above, the problem was not at all with string.length(), but rather with Windows converting all \n to \r\n when sending to stdout. It even does this with existing \r\n sequences, turning them into \r\r\n. Thank you, Microsoft, for always knowing so much better than me what I really want to do.
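To make the arithmetic concrete, here is a rough sketch of that translation, written as a plain function so it behaves the same on any platform. The names textModeExpand and checkWorkedExample are made up for illustration and are not part of any library; the function only imitates the effect of text-mode output and is not how the runtime implements it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: imitates text-mode output on Windows, which writes
// "\r\n" for every '\n' in the data, regardless of what precedes it.
std::string textModeExpand(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (c == '\n') out += '\r';  // added even when a '\r' is already there
        out += c;
    }
    return out;
}

// Checks the byte counts reported in the question against the sketch.
void checkWorkedExample() {
    const std::string none = "<HTML><BODY>Hello World!</BODY></HTML>";
    const std::string one  = "<HTML>\r\n<BODY>Hello World!</BODY></HTML>";
    const std::string two  = "<HTML>\r\n<BODY>\r\nHello World!</BODY></HTML>";
    assert(none.size() == 38 && textModeExpand(none).size() == 38);
    assert(one.size()  == 40 && textModeExpand(one).size()  == 41);
    assert(two.size()  == 42 && textModeExpand(two).size()  == 44);
    assert(textModeExpand("<HTML>\r\n") == "<HTML>\r\r\n");
}
```

One extra byte per line break is exactly the difference between what length() reports and what arrives on the other end, which is why Content-Length came up short.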
My first solution, to convert all \r\n to \n before outputting (so that when Windows converted them back to \r\n the byte count would be correct) really was not an ideal solution, as it only addressed files being read and output, and anything output directly by the program was again causing the byte count to be off. Of course, I could have just appended \r\n to all my output (only to strip it and then have Windows put it back), but that seemed a bit...kludgey. After a good night's sleep and more thought and reading, I decided that forcing Windows to keep its hands off my bytes was the better solution -- to change stdout to binary mode.
However, the question that BoundaryImposition linked to did not have all the information I needed. So, after much googling and reading, here for posterity is the complete solution I settled on:
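    #include <stdio.h>    // stdout, fileno
    #if defined(_WIN32) || defined(_WIN64)
    #include <io.h>       // setmode
    #include <fcntl.h>    // O_BINARY
    #endif

    int main() {
    #if defined(_WIN32) || defined(_WIN64)
        // Put stdout in binary mode so \n is no longer expanded to \r\n.
        // (Microsoft's documentation spells these _setmode, _fileno and _O_BINARY.)
        setmode(fileno(stdout), O_BINARY);
    #endif
        // ... write the headers and body as usual ...
    }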
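The test program in the question writes to a file through an ofstream rather than to stdout, and the setmode call above does not touch that stream. A minimal sketch of the equivalent fix for that case, assuming the goal is simply that the bytes on disk match the buffer: open the stream with ios::binary and write an explicit byte count. The helper names writeRaw, sizeOnDisk and checkRoundTrip are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: writes the buffer byte for byte; true on success.
bool writeRaw(const std::string& path, const std::string& body) {
    std::ofstream os(path, std::ios::binary);  // binary: no newline translation
    if (!os) return false;
    // write() takes an explicit length, so it does not stop at a NUL byte
    // the way operator<< on a char* does.
    os.write(body.data(), static_cast<std::streamsize>(body.size()));
    os.close();
    return !os.fail();
}

// Hypothetical helper: size of the file in bytes, or -1 if it cannot be opened.
long long sizeOnDisk(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;
    return static_cast<long long>(in.tellg());
}

// Round trip: the size on disk equals the value Content-Length would carry.
void checkRoundTrip() {
    const std::string body = "<HTML>\r\n<BODY>\r\nHello World!</BODY></HTML>";
    const bool ok = writeRaw("testout.txt", body);
    assert(ok);
    (void)ok;
    assert(sizeOnDisk("testout.txt") == static_cast<long long>(body.size()));
}
```

With the stream in binary mode, body.size() is the number of bytes that actually get written, so the same value can go straight into the Content-Length header.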
Thank you to BoundaryImposition and to everyone else for your help and for continuing to beat me over the head with what I really needed to do, until it finally stuck.