I have a pipe with an endless stream of strings being written to it. These strings are a mix of ASCII and emoji. The problem is that I am reading them like this:
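#include <unistd.h>  /* read(), ssize_t */
char msg[100];
ssize_t length = read(fd, msg, sizeof msg - 1);  /* leave room for the terminator */
if (length < 0) length = 0;  /* read failed: avoid writing to msg[-1] */
msg[length] = 0;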
But sometimes an emoji, which I'm guessing is multibyte, gets cut in half, and then when I print to the screen I get the diamond question mark symbol (the replacement character shown for invalid UTF-8).
If anyone knows how to prevent this please fill me in; I've been searching for a while now.
If you're reading chunks of bytes and want to output chunks of UTF-8, you'll have to do at least some minimal UTF-8 decoding yourself. The simplest check is to look at each byte (let's call it b) and see whether it is a continuation byte, i.e. whether its top two bits are 10:
bool is_cont = (0x80 == (0xC0 & b));
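Any byte that is not a continuation byte starts a sequence, which continues until the next non-continuation byte. A UTF-8 sequence is at most 4 bytes long, so a 4-byte buffer is enough to hold an incomplete sequence left at the end of one read until the next read completes it.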
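To make that concrete, here is a minimal sketch of the idea, not a full UTF-8 validator. The names utf8_complete_prefix and copy_utf8 are made up for illustration, and the sketch assumes POSIX read()/write() on plain file descriptors. Instead of a separate 4-byte buffer, it keeps the incomplete tail (at most 3 bytes) at the front of the read buffer and lets the next read() append to it. Invalid input is passed through unchanged rather than rejected.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns how many leading bytes of buf[0..len) can be
   output without cutting a UTF-8 sequence in half. Whatever follows that
   prefix (at most 3 bytes) is the start of an incomplete sequence. */
size_t utf8_complete_prefix(const unsigned char *buf, size_t len)
{
    size_t i = len;
    size_t back = 0;

    /* Step back over at most 3 trailing continuation bytes (10xxxxxx). */
    while (i > 0 && back < 3 && (buf[i - 1] & 0xC0) == 0x80) {
        i--;
        back++;
    }
    if (i == 0)
        return len;             /* no lead byte found: pass through as-is */

    unsigned char lead = buf[i - 1];
    size_t need;                /* sequence length announced by the lead byte */
    if (lead < 0x80)                need = 1;
    else if ((lead & 0xE0) == 0xC0) need = 2;
    else if ((lead & 0xF0) == 0xE0) need = 3;
    else if ((lead & 0xF8) == 0xF0) need = 4;
    else return len;            /* not a valid lead byte: pass through as-is */

    size_t have = len - (i - 1);    /* bytes of the last sequence seen so far */
    return (have < need) ? i - 1 : len;
}

/* Hypothetical helper: copies in_fd to out_fd so that every write ends on a
   sequence boundary. Returns 0 at end of input, -1 on error. For brevity it
   does not retry short writes. */
int copy_utf8(int in_fd, int out_fd)
{
    unsigned char buf[100];
    size_t held = 0;            /* incomplete bytes carried over (0..3) */

    for (;;) {
        ssize_t n = read(in_fd, buf + held, sizeof buf - held);
        if (n < 0)
            return -1;
        if (n == 0)
            break;              /* end of input */

        size_t len = held + (size_t)n;
        size_t ok = utf8_complete_prefix(buf, len);
        if (ok > 0 && write(out_fd, buf, ok) < 0)
            return -1;

        held = len - ok;
        memmove(buf, buf + ok, held);   /* keep the tail for the next read */
    }

    /* Input ended mid-sequence: flush what is left rather than drop it. */
    if (held > 0 && write(out_fd, buf, held) < 0)
        return -1;
    return 0;
}
```

With the pipe from the question, this would be called as copy_utf8(fd, STDOUT_FILENO). Note that it only keeps individual code points intact; an emoji built from several code points (for example with skin-tone modifiers) can still be split across two writes, which is harmless for a terminal because each write is valid UTF-8 on its own.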