I'm using Winsock to make my HTTP requests. On the server side, I run PHP code that reads the contents of a file, base64-encodes it, and prints it (echo). On the client side, my C++ code does a simple HTTP GET request. I have verified that the problem is on the client side, not the server side.
Client-side socket code:
locale local;
char buffer[1000000];
int i = 0;
string get_Website(string url, string path = "/", string useragent = "Mozilla") {
string website_HTML;
WSADATA wsaData;
SOCKET Socket;
SOCKADDR_IN SockAddr;
int lineCount = 0;
int rowCount = 0;
struct hostent *host;
string get_http;
get_http = "GET " + path + " HTTP/1.0\r\nHost: " + url + "\r\nUser-Agent: " + useragent + "\r\nConnection: close\r\n\r\n";
if (WSAStartup(MAKEWORD(2, 2), &wsaData) != 0) {
cout << "WSAStartup failed.\n";
system("pause");
//return 1;
}
Socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
host = gethostbyname(url.c_str());
SockAddr.sin_port = htons(44980);
SockAddr.sin_family = AF_INET;
SockAddr.sin_addr.s_addr = *((unsigned long*)host->h_addr);
if (connect(Socket, (SOCKADDR*)(&SockAddr), sizeof(SockAddr)) != 0) {
cout << "Could not connect";
system("pause");
//return 1;
}
send(Socket, get_http.c_str(), strlen(get_http.c_str()), 0);
int nDataLength;
while ((nDataLength = recv(Socket, buffer, 1000000, 0)) > 0) {
int i = 0;
while (buffer[i] >= 32 || buffer[i] == '\n' || buffer[i] == '\r') {
website_HTML += buffer[i];
i += 1;
}
}
closesocket(Socket);
WSACleanup();
return website_HTML;
}
The response length keeps changing, although the server returns the same response every time. The reason for the big buffer is that I thought the buffer size might be the problem, since I am retrieving an entire file's base64-encoded form.
Essentially, the problem is I am not getting the full/correct response.
while ((nDataLength = recv(Socket, buffer, 1000000, 0)) > 0) {
This reads from the socket. The number of bytes read goes into `nDataLength`. Immediately afterwards:
int i = 0;
while (buffer[i] >= 32 || buffer[i] == '\n' || buffer[i] == '\r') {
website_HTML += buffer[i];
i += 1;
}
This logic completely ignores the byte count in `nDataLength`, and just blindly reads the contents of the buffer, continuously, until the first control character that's not a newline or a carriage return.
Besides the fact that the response to your HTTP request can certainly contain binary characters, the response to your HTTP request will arrive in multiple packets, of varying sizes, each of which will be written on top of the previous one, starting at the beginning of `buffer`. The allocated `buffer` appears to be in static storage, so it will be zero-initialized, and it's rather unlikely that a single packet will exceed 999999 bytes; so it's unlikely that the loop will run off the end of the buffer. It'll hit the `\0` at some point.
But, since the response's packets will be of varying sizes, a shorter packet will have its contents replace the initial, longer contents of the preceding packet; but the broken logic will fail to detect it, and will swallow up the new packet followed by the trailing end of the previous packet.
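To make the effect concrete, here is a minimal sketch that replays the question's inner loop against a simulated buffer. The helper names `scan_until_control` and `simulate_two_chunks`, the 64-byte buffer, and the chunk contents are all made up for illustration; `memset`/`memcpy` merely stand in for two `recv` calls that return 10 and 3 bytes.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Mimics the question's inner loop: scans the buffer until the first
// control character other than '\n' or '\r', ignoring how many bytes
// were actually received.
std::string scan_until_control(const char* buffer) {
    assert(buffer != nullptr);
    std::string out;
    int i = 0;
    while (buffer[i] >= 32 || buffer[i] == '\n' || buffer[i] == '\r') {
        out += buffer[i];
        i += 1;
    }
    return out;
}

// Simulates two recv() calls writing into the same zero-initialized buffer:
// a 10-byte chunk followed by a 3-byte chunk (hypothetical sizes).
std::string simulate_two_chunks() {
    static char buffer[64];                // static storage: zero-initialized
    std::string result;

    std::memset(buffer, 'A', 10);          // first "recv" delivers 10 bytes of 'A'
    result += scan_until_control(buffer);  // appends the 10 'A's

    std::memcpy(buffer, "BBB", 3);         // second "recv" delivers only 3 bytes
    result += scan_until_control(buffer);  // appends "BBB" plus 7 stale 'A's

    return result;
}
```

Only 13 bytes were "received" here, but the loop collects 20 characters, because the second scan picks up the tail of the first chunk that is still sitting in the buffer.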
Somewhat messy. Fix that up by using `nDataLength` to copy and append the contents of each packet to your string.
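As a rough sketch of that fix, the loop below appends exactly the number of bytes each read reports. To keep it portable and easy to try without a socket, the read is abstracted behind a callable: `read_all`, `recv_fn`, and `FakeSource` are hypothetical names introduced here, and `recv_fn(buf, cap)` stands in for `recv(Socket, buf, cap, 0)`. The 4096-byte buffer size is an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Drains a recv-like source into a string, appending exactly the number
// of bytes each call reports. recv_fn(buf, cap) is a stand-in for
// recv(Socket, buf, cap, 0): it returns the number of bytes written, or
// a value <= 0 once the peer has closed the connection or an error occurred.
template <typename RecvFn>
std::string read_all(RecvFn recv_fn) {
    std::vector<char> buffer(4096);  // a modest buffer is enough
    std::string response;
    int n;
    while ((n = recv_fn(buffer.data(), static_cast<int>(buffer.size()))) > 0) {
        assert(static_cast<std::size_t>(n) <= buffer.size());
        // Use the byte count, not the buffer contents, to decide how much to copy.
        response.append(buffer.data(), static_cast<std::size_t>(n));
    }
    return response;
}

// Hypothetical fake source that hands out `data` in chunks of the given
// sizes, imitating a response that arrives in pieces of varying length.
struct FakeSource {
    std::string data;
    std::vector<int> chunk_sizes;
    std::size_t pos = 0;   // next unread byte in data
    std::size_t next = 0;  // next entry in chunk_sizes

    FakeSource(std::string d, std::vector<int> s)
        : data(std::move(d)), chunk_sizes(std::move(s)) {}

    int operator()(char* buf, int cap) {
        if (pos >= data.size() || next >= chunk_sizes.size()) {
            return 0;  // behaves like a closed connection
        }
        std::size_t want = static_cast<std::size_t>(chunk_sizes[next++]);
        std::size_t n = std::min({want, static_cast<std::size_t>(cap),
                                  data.size() - pos});
        std::memcpy(buf, data.data() + pos, n);
        pos += n;
        return static_cast<int>(n);
    }
};
```

In the function from the question, this amounts to replacing the inner `while` loop with `website_HTML.append(buffer, nDataLength);`. Because the copy is driven by the byte count, it also preserves `\0` and other control bytes instead of stopping at them.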