What's the standard way of serializing a UTF-8 string in JSON? Should it use a \u escape sequence, or should it be the hex code of the UTF-8 bytes?
I want to serialize some sensor readings with units in JSON format.
For example, I have temperature readings with the unit °C. Should it be serialized as
```
{
"units": "\u00b0"
}
```
or should it be something like
```
{
"units":"c2b0"
}
```
Or are both of these supported by the standard?
If JSON is used to exchange data between systems, it must be encoded as UTF-8 (see RFC 8259, section 8.1). UTF-16 and UTF-32 are no longer allowed. So it is not necessary to escape the degree character, and I strongly recommend against escaping unnecessarily.
Correct and recommended
{
"units": "°C"
}
Of course, the text must then actually be encoded as UTF-8, so that the degree sign is transmitted as the two bytes C2 B0.
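As a minimal sketch of what that looks like in practice, assuming Python and its standard `json` module (the question does not name a language, and the `reading` dictionary and its field names are made up for illustration):

```python
import json

# Hypothetical sensor reading; the field names are illustrative only.
reading = {"value": 21.5, "units": "°C"}

# ensure_ascii=False keeps the degree sign as a literal character
# instead of escaping it as \u00b0.
text = json.dumps(reading, ensure_ascii=False)

# Encode the JSON text as UTF-8 before writing it to a file or socket.
payload = text.encode("utf-8")

print(text)
print(payload)
```

In the encoded payload, the degree sign appears as the bytes `C2 B0`. That is where the `c2b0` in the question comes from: it is the UTF-8 byte representation, not something to write into the JSON string by hand.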
If JSON is used in a closed ecosystem, you can use other text encodings (though I would recommend against it unless you have a very good reason). If you need to escape the degree character in your non-UTF-8 encoding, the correct escape sequence is \u00b0.
Possible but not recommended
{
"units": "\u00b0C"
}
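To illustrate that the escaped and the literal form carry the same data, here is a small sketch, again assuming Python's standard `json` module (variable names are illustrative):

```python
import json

# Raw strings keep the backslash, so the parser sees the JSON escape \u00b0.
escaped = json.loads(r'{"units": "\u00b0C"}')
literal = json.loads('{"units": "°C"}')

# Both documents decode to the same value.
same = escaped == literal
print(same)

# Note: Python's json.dumps escapes non-ASCII characters by default
# (ensure_ascii=True), so it produces the escaped form unless told otherwise.
default_output = json.dumps({"units": "°C"})
print(default_output)
```

A conforming parser treats both forms identically, so the choice only affects readability and size of the serialized text.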
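Your second approach is incorrect under all circumstances. It is syntactically valid JSON, but the value is the four-character string c2b0, not a degree sign.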
Incorrect
{
"units":"c2b0"
}
It is also incorrect to use something like "\xc2\xb0". This is the escaping used in C/C++ source code. It is also used by debuggers to display strings. In JSON, it is always invalid.
Incorrect as well
{
"units":"\xc2\xb0"
}
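Both incorrect variants can be checked with a short sketch, assuming Python's standard `json` module as the parser (other conforming parsers should behave equivalently, but that is not verified here):

```python
import json

# "c2b0" parses fine, but it is just four ASCII characters.
hex_text = json.loads('{"units": "c2b0"}')["units"]
print(len(hex_text))

# \x is not a JSON escape sequence, so the parser rejects the document.
try:
    json.loads(r'{"units": "\xc2\xb0"}')
    rejected = False
except json.JSONDecodeError as err:
    rejected = True
    print("rejected:", err)
```

The first case fails silently because the receiver gets the wrong text, while the second fails loudly at parse time.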