Why does the degree symbol's encoding differ between Unicode and UTF-8?
According to http://www.utf8-chartable.de/ and http://www.fileformat.info/info/unicode/char/b0/index.htm, the Unicode code point is B0, but the UTF-8 encoding is C2 B0. How come?
UTF-8 is a way to encode Unicode code points using a variable number of bytes (the number of bytes depends on the code point).
Code points between U+0080 and U+07FF use the following 2-byte encoding:
110xxxxx 10xxxxxx
where the x's represent the 11 bits of the code point being encoded, padded with leading zeros as needed.
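For illustration, here is a minimal Python sketch of that 2-byte rule (the function name encode_2byte is just a placeholder for this example, not a standard API):

    def encode_2byte(cp: int) -> bytes:
        """Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes."""
        assert 0x80 <= cp <= 0x7FF
        byte1 = 0b11000000 | (cp >> 6)    # 110xxxxx: the high 5 of the 11 bits
        byte2 = 0b10000000 | (cp & 0x3F)  # 10xxxxxx: the low 6 bits
        return bytes([byte1, byte2])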
Let's consider U+00B0. In binary, 0xB0 is 10110000; padded to the 11 bits of the template, that is 00010110000. Substituting those bits into the template gives:
11000010 10110000
In hex, this is 0xC2 0xB0.
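You can confirm this with any UTF-8 encoder, for example Python's built-in str.encode (the second line uses the encode_2byte sketch above):

    >>> "\u00b0".encode("utf-8")
    b'\xc2\xb0'
    >>> encode_2byte(0xB0)
    b'\xc2\xb0'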