Tags: c, utf-8, label, gtk3

GTK int to unicode char conversion for display in GTK label


I am receiving hex data from a serial port, and I have converted it to the corresponding int values.

I want to display the equivalent characters in a GTK label. But according to the character map, the values from 0x00 to 0x20 are control characters (plus the space), which have no visible glyph.

So I was thinking of adding 256 to each converted int value and showing the corresponding Unicode character in the label.

But I am not able to convert an int to Unicode. Say I have an array of ints 266, 267, 289, ...: how should I convert it to Unichar and display it in a GTK label?

I know it may seem like a very basic problem to you all, but I have struggled a lot and didn't find any answer. Please help.


Solution

  • The GTK functions that set text on UI elements all assume UTF-8 strings. A Unicode code point with a value above 127 does not form valid UTF-8 if you simply write that value out as a single unsigned byte. I can think of a couple of ways around this.

    1. Store the code point as a 32-bit integer (which is essentially UTF-32) and use the functions in the iconv library, or something similar, to do the conversion from UTF-32 to UTF-8. Other conversion implementations in C are widely available. Converting your unsigned byte to UTF-32 really amounts to padding it with three zero bytes (in C, just assigning it to a 32-bit unsigned integer), which is easy to code.

    2. Generate a UTF-8 string yourself, based on the 8-bit code point value. Since you have a limited range of values, this is easy-ish. If you look at the way that UTF-8 is written out, e.g., here:

    https://en.wikipedia.org/wiki/UTF-8

    you'll see that the values you need to represent (code points from 0x80 through 0x7FF) are written as two bytes, the first beginning with the bits 110 and the second with the bits 10. The bits of the code point value are split between these two bytes: the upper five bits go into the first byte and the lower six bits into the second. Doing this conversion needs a little masking and bit-shifting, but it's not hugely difficult.
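
The masking and shifting for option 2 can be sketched in plain C as below. This is only an illustration under assumptions: the helper names `encode_utf8_2byte` and `ints_to_utf8` are made up for this example, and only code points up to 0x7FF are handled, which covers the 256 to 511 range produced by adding 256 to a byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point in 0x00..0x7FF as UTF-8.
 * Writes 1 or 2 bytes plus a terminating NUL into out and returns the
 * number of bytes written (not counting the NUL). Returns 0 for code
 * points that would need three or more bytes. */
static size_t encode_utf8_2byte(uint32_t cp, char out[3])
{
    assert(out != NULL);
    if (cp < 0x80) {                          /* plain ASCII: one byte */
        out[0] = (char)cp;
        out[1] = '\0';
        return 1;
    }
    if (cp < 0x800) {                         /* two bytes: 110xxxxx 10xxxxxx */
        out[0] = (char)(0xC0 | (cp >> 6));    /* upper five bits */
        out[1] = (char)(0x80 | (cp & 0x3F));  /* lower six bits */
        out[2] = '\0';
        return 2;
    }
    out[0] = '\0';
    return 0;
}

/* Hypothetical helper: convert an array of ints into one NUL-terminated
 * UTF-8 string. Stops early at a negative or unsupported value, or when
 * the buffer is full. Note that a value of 0 would be written as a NUL
 * byte and therefore end the C string. Returns the bytes written. */
static size_t ints_to_utf8(const int *values, size_t count,
                           char *out, size_t out_size)
{
    size_t used = 0;
    assert(values != NULL && out != NULL && out_size > 0);
    for (size_t i = 0; i < count; i++) {
        char tmp[3];
        size_t n = 0;
        if (values[i] >= 0)
            n = encode_utf8_2byte((uint32_t)values[i], tmp);
        if (n == 0 || used + n >= out_size)
            break;
        for (size_t j = 0; j < n; j++)
            out[used++] = tmp[j];
    }
    out[used] = '\0';
    return used;
}
```

The resulting buffer is an ordinary UTF-8 C string, so it should be usable with `gtk_label_set_text(GTK_LABEL(label), buf)`, where `label` is assumed to be your existing label widget. If you would rather not hand-roll the encoding, GLib, which GTK already depends on, provides `g_unichar_to_utf8()` for the general case.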

    Having said all that, I have to wonder why you'd want to assign a character to a label that users will likely not understand? Why not just write the hex number on the label, if it is not a displayable character?
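
The hex fallback suggested above could look roughly like this. It is a sketch, not a drop-in solution: `format_byte_for_label` is a hypothetical name, and treating only 0x21 through 0x7E as displayable is an assumption you may want to adjust.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: turn one received byte into label text.
 * Visible ASCII (0x21..0x7E) is written as the character itself;
 * anything else (control characters, space, DEL, bytes >= 0x80)
 * is written as a hex literal such as "0x1B".
 * out must hold at least 5 bytes. Returns the text length. */
static int format_byte_for_label(unsigned char b, char *out, size_t out_size)
{
    assert(out != NULL && out_size >= 5);
    if (b > 0x20 && b < 0x7F)
        return snprintf(out, out_size, "%c", b);
    return snprintf(out, out_size, "0x%02X", (unsigned)b);
}
```

Because the output is pure ASCII, it is already valid UTF-8 and can be handed to the label without any further conversion.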