Tags: c++, string, zlib, url-encoding, urldecode

Difference in encoded and decoded string


I used zlib compression to produce a string.

void StatsClientImpl::sendToServer(std::stringstream &sstr) // to include an update interval version
{
    std::string error_msg = "";
    std::stringstream temp(AppState::getPid() + "," + AppState::getInstallOS());
    temp << sstr.str();
    std::string s = zlib_compress(temp.str());
.......

zlib_compress is defined as in: https://panthema.net/2007/0328-ZLibString.html

Then I printed the size: std::cout << s.size() << "\n";

The size of the string was shown to be 18.

Then I did:

CURL *handle = curl_easy_init();
char* o = curl_easy_escape(handle, s.data(), s.size());
std::string bin(o);
std::cout << o << "\n";
char* i = curl_easy_unescape(handle, bin.data(), bin.size(), NULL);
std::string in(i);
std::cout << i << in.size() << "\n";

This gave me the following output:

x%DA%D3%D1%81%80%E2%92%D2%B44%1D%03%00%1BW%03%E5
x??с??Ҵ413
x%DA%D3%D1%81%80%E2%92%D2%B44%1D%03

I was passing this as the input string:

,,,,,,stuff,0

Why is there a difference in the decoded and encoded strings? How do I fix this?


Solution

  •     std::string in(i);
    

    The problem lies here: this std::string constructor expects a null-terminated string, so it stops copying at the first 0 byte, which occurs early in the zlib output (after 13 bytes here, which is why in.size() prints 13). Ask curl_easy_unescape for the length of the unescaped data and construct in from that length:

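        int sz = 0;
        // sz receives the number of decoded bytes, embedded 0 bytes included
        char* i = curl_easy_unescape(handle, bin.data(), bin.size(), &sz);
        std::string in(i, i + sz);
        curl_free(i);  // buffers returned by curl must be released with curl_free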
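
The truncation can be reproduced without libcurl. The sketch below is only an illustration: the byte array is reconstructed from the escaped output shown in the question, and the names kCompressed, copy_as_c_string and copy_with_length are hypothetical helpers, not part of any library.

```cpp
#include <cstddef>
#include <string>

// The 18 bytes of compressed data, reconstructed from the escaped string
// printed in the question (assumption: that output is complete).
static const unsigned char kCompressed[] = {
    'x',  0xDA, 0xD3, 0xD1, 0x81, 0x80, 0xE2, 0x92, 0xD2,
    0xB4, '4',  0x1D, 0x03, 0x00, 0x1B, 'W',  0x03, 0xE5};
static const std::size_t kCompressedSize = sizeof(kCompressed);

// Mirrors std::string in(i): copying stops at the first 0 byte.
// The caller must guarantee that the buffer contains a 0 byte.
std::string copy_as_c_string(const unsigned char* data) {
    return std::string(reinterpret_cast<const char*>(data));
}

// Mirrors std::string in(i, i + sz): copies exactly `size` bytes,
// embedded 0 bytes included.
std::string copy_with_length(const unsigned char* data, std::size_t size) {
    return std::string(reinterpret_cast<const char*>(data), size);
}
```

With this data, copy_as_c_string(kCompressed) yields a 13-byte string, while copy_with_length(kCompressed, kCompressedSize) keeps all 18 bytes.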
    

    But I have a question: why doesn't this happen with bin(o)?

    Because o points to a URL-encoded string, in which every 0 byte is written as %00, so it contains no null bytes other than the terminator.
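
To see why the escaped form is safe to treat as a C string, here is a rough stand-in for the escaping step. percent_encode is a hypothetical helper written for this illustration, not the libcurl implementation; it assumes the usual rule of leaving A-Z, a-z, 0-9, '-', '.', '_' and '~' untouched and writing every other byte as %XX.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the escaping step (not libcurl's code):
// unreserved characters pass through, every other byte becomes %XX.
std::string percent_encode(const std::string& raw) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : raw) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += kHex[c >> 4];    // high nibble
            out += kHex[c & 0x0F];  // low nibble
        }
    }
    return out;
}
```

A 0 byte in the input comes out as the three characters %00, so the encoded text never contains an embedded null and std::string bin(o) copies all of it.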