I am serializing a JSON object to CBOR in C++ with the nlohmann::json library, and my use case involves reading the CBOR byte string output in C#. I've noticed that when dumping a JSON object to a string with nlohmann::json, JSON string values (i.e., case value_t::string) are escaped (a call to escape_string is made), but no such call is made for string values on the CBOR path.
I was reading the CBOR specification (RFC 7049), and it seems that strings do not need to be escaped when serializing to CBOR. The behavior of the nlohmann::json library is consistent: strings are not escaped when serializing, nor expected to be escaped when deserializing. But it appears that Newtonsoft.Json (a C# library) expects that. Is that a valid expectation? Or am I doing something wrong in the process?
C++ side:
nlohmann::json json_doc;
json_doc["characters"] = nlohmann::json::array();
for (std::size_t i = 0; i < characters.size(); i++) {
json_doc["characters"][i]["name"] = (characters[i] != nullptr) ? characters[i]->name() : "";
}
// Serialize to CBOR and copy the raw bytes into the output string.
std::vector<uint8_t> cbor = nlohmann::json::to_cbor(json_doc);
output->assign(reinterpret_cast<const char*>(cbor.data()), cbor.size());
C# side, where cbor_bytes is the CBOR byte string (the C++ output vector):
CBORObject cbor = CBORObject.DecodeFromBytes(cbor_bytes);
output = cbor.ToString();
The resulting output string is malformed:
{"characters": [{"name": "Clara Oswald"}, {"name": "Kensi Blye"}, {"name": "Temperance "Bones" Brennan"}]}
and obviously cannot be parsed:
JObject output_obj = JObject.Parse(output);
CBOR (Concise Binary Object Representation) is not JSON (JavaScript Object Notation). Although CBOR may have borrowed some concepts from JSON, it is clearly a different format with different rules and goals. CBOR is a binary format; JSON is text. In CBOR, strings have length prefixes, whereas they do not in JSON. Furthermore, CBOR does not allow arbitrary whitespace between elements (it wouldn't make sense for a binary format), whereas JSON does (for human readability). Ultimately, CBOR does not need a mechanism to escape strings because it does not require delimiters to tell where a string starts and ends. JSON, on the other hand, requires double quotes to mark the beginning and end of each string. As a consequence, quotes and control characters within strings must be escaped with backslashes in JSON, as well as literal backslashes themselves. There is no getting around this rule if you want to ensure the JSON will be parsable.
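To make the length-prefix point concrete, here is a minimal sketch of how a CBOR text string (major type 3) is laid out. The function name encode_cbor_text is hypothetical, it is not part of nlohmann::json, and it only handles strings up to 65535 bytes to stay short. Note that the payload bytes are copied verbatim: a double quote inside the string is stored as-is because the decoder knows the length in advance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a library API): encode a UTF-8 string as a
// CBOR text string. The header carries the byte length; the payload is
// the raw, unescaped bytes of the string.
std::vector<std::uint8_t> encode_cbor_text(const std::string& s) {
    std::vector<std::uint8_t> out;
    const std::size_t n = s.size();
    if (n < 24) {
        // Short strings: length is packed into the initial byte (0x60..0x77).
        out.push_back(static_cast<std::uint8_t>(0x60 | n));
    } else if (n <= 0xFF) {
        // 0x78 means "text string, one length byte follows".
        out.push_back(0x78);
        out.push_back(static_cast<std::uint8_t>(n));
    } else if (n <= 0xFFFF) {
        // 0x79 means "text string, two big-endian length bytes follow".
        out.push_back(0x79);
        out.push_back(static_cast<std::uint8_t>(n >> 8));
        out.push_back(static_cast<std::uint8_t>(n & 0xFF));
    } else {
        throw std::length_error("string too long for this sketch");
    }
    // No escaping: quotes, backslashes and control characters go in unchanged.
    out.insert(out.end(), s.begin(), s.end());
    assert(out.size() > n);  // at least one header byte precedes the payload
    return out;
}
```

For the 26-byte name Temperance "Bones" Brennan, this produces the header bytes 0x78 0x1A followed by the 26 raw bytes, quotes included.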
In your code above you are using the CBORObject.ToString() method to turn the object into a string. If this CBORObject is from a third-party library, does the documentation state that ToString() will produce valid JSON? If so, then it definitely has a bug; it should be doing the proper escaping as required by the JSON spec. If there is no such promise of valid JSON, then you can't expect that Json.Net will be able to parse the string, even if it sort of looks like JSON. (You might check to see whether the CBORObject has some other dedicated method like ToJson() for performing this conversion.) If CBORObject is your own code, then it is on you to escape the strings properly when converting from CBOR to JSON.
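If you do end up writing the CBOR-to-JSON conversion yourself, the escaping step might look roughly like the sketch below. It is written in C++ to match the question's serializing side, but the same logic carries over to C#. The name to_json_string is hypothetical, and the sketch assumes the input is already valid UTF-8 (bytes of 0x80 and above are passed through untouched).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not a library API): wrap a UTF-8 string in double
// quotes and escape the characters JSON requires to be escaped, namely
// the quote, the backslash, and control characters below U+0020.
std::string to_json_string(const std::string& s) {
    std::string out = "\"";
    for (unsigned char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\b': out += "\\b";  break;
            case '\f': out += "\\f";  break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters use the \uXXXX form.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x",
                                  static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    out += '"';
    assert(out.size() >= s.size() + 2);  // two quotes plus >= 1 byte per input byte
    return out;
}
```

Applied to the problematic name from the question, this yields "Temperance \"Bones\" Brennan" with the inner quotes preceded by backslashes, which a JSON parser will accept.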