Tags: encoding, hex, protocol-buffers, protobuf-c

Google protobuf adds weird 0xff values when decoding bytes greater than or equal to 0x80 from a message


I have this message file:

message samplemessage
{
    optional bytes byte_string = 1;
}

And this program which uses this protobuf file:

#include <cstdio>    // printf
#include <cstring>   // strlen
#include <iostream>
#include <fstream>
#include <string>
#include "mymessages.pb.h"
using namespace std;




// Main function: Reads the entire packet from file,
//   modifies the byte_string field and then writes the modified version back to disk.


void print_hex2(const char* string, int length) {
for (int i = 0; i<length;i++) {
    printf("%02x", (unsigned int) *string++);
}
printf("\n");
}


int main(int argc, char* argv[]) {
// Verify that the version of the library that we linked against is
// compatible with the version of the headers we compiled against.
GOOGLE_PROTOBUF_VERIFY_VERSION;

if (argc != 3) {
    cerr << "Usage: " << argv[0] << " PACKET_FILE OUTPUT_FILE" << endl;
    return -1;
}
samplemessage thing;

{
    // Read the existing packet.
    fstream input(argv[1], ios::in | ios::binary);
    if (!input) {
        cout << argv[1] << ": File not found.   Creating a new file." << endl;
    } else if (!thing.ParseFromIstream(&input)) {
        cerr << "Failed to parse packet file." << endl;
        return -1;
    }
}




print_hex2(thing.byte_string().c_str(), strlen(thing.byte_string().c_str()));

printf("%s\n", thing.byte_string().c_str());

unsigned char stuff[10000] = "\x41\x80";


thing.set_byte_string(reinterpret_cast<const unsigned char *>(stuff));


printf("%s\n", thing.byte_string().c_str());


print_hex2(thing.byte_string().c_str(), strlen(thing.byte_string().c_str()));

{
    // Write the new packet to disk.
    fstream output(argv[2], ios::out | ios::trunc | ios::binary);
    if (!thing.SerializeToOstream(&output)) {
        cerr << "Failed to write packet file." << endl;
        return -1;
    }
}

// Optional:    Delete all global objects allocated by libprotobuf.
google::protobuf::ShutdownProtobufLibrary();

return 0;
}

When compiled and run, this program prints the following for me:

41ffffff80

But when I change the \x80 to \x7f, the ff values do not appear and I get this output:

417f

Looking at the output file with xxd, I do not see these ff bytes anywhere:

00000000: 0a02 4180                                ..A.

Why is this? Is this some encoding thing? I thought that a bytes field in protobuf holds raw bytes, but that does not seem to be the case here. Why are the ff bytes being added?

Thanks in advance!


Solution

  • printf("%02x", (unsigned int) *string++);

    Apparently char is signed on your platform, which is quite common.

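    Then when you cast (unsigned int)((char)0x80), the char holding 0x80 has the value -128, and it is sign-extended when widened to int. The result is 0xffffff80 (with a 32-bit int). The "%02x" format only sets a minimum width of two digits, so all eight hex digits are printed. The bytes stored in the message are untouched, which is why xxd shows only 41 80.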

    Try this instead:

    printf("%02x", (unsigned int)(unsigned char) *string++);
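
To see the difference in isolation, here is a minimal sketch that does not need protobuf at all. The helper names hex_sign_extended and hex_bytes are made up for this illustration, the input is declared as signed char so the behavior does not depend on whether plain char is signed on your platform, and the expected result assumes a 32-bit unsigned int.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats each byte the way the question's print_hex2 does,
// casting the signed char straight to unsigned int (negative values get sign-extended).
std::string hex_sign_extended(const signed char* data, std::size_t length) {
    std::string out;
    char buf[16];
    for (std::size_t i = 0; i < length; ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned int>(data[i]));
        out += buf;
    }
    return out;
}

// Hypothetical helper: goes through unsigned char first, so every byte
// stays in the range 0..255 and prints as exactly two hex digits.
std::string hex_bytes(const signed char* data, std::size_t length) {
    std::string out;
    char buf[16];
    for (std::size_t i = 0; i < length; ++i) {
        std::snprintf(buf, sizeof buf, "%02x",
                      static_cast<unsigned int>(static_cast<unsigned char>(data[i])));
        out += buf;
    }
    return out;
}
```

With the two bytes 0x41 and 0x80 (the latter written as -128 in a signed char array), hex_sign_extended should give 41ffffff80 and hex_bytes should give 4180, matching the output in the question and the bytes that xxd shows.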