I am trying to encode a Unicode character in Dart, but the result is an invalid byte array.
The character: 🔥
The bytes: [FF, FE, 3D, D8, 25, DD]
The string is encoded with a BOM. After decoding these bytes I can see that the string is parsed correctly; the emoji shows up in my IDE.
Then I encode the string again, but that gives me a byte array I don't understand:
[FF, FE, FD, FF, FD, FF]
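As a sanity check that does not use the package, the expected byte sequence can be reproduced from Dart's own UTF-16 code units, since Dart strings are stored as UTF-16 internally; this is just a sketch for illustration:
void main() {
  // Dart strings store UTF-16 code units; 🔥 (U+1F525) is the
  // surrogate pair 0xD83D 0xDD25.
  final units = '🔥'.codeUnits; // [0xD83D, 0xDD25]

  // Prepend the little-endian BOM (FF FE) and split each code unit
  // into its low and high byte.
  final bytes = <int>[0xFF, 0xFE];
  for (final unit in units) {
    bytes
      ..add(unit & 0xFF)
      ..add((unit >> 8) & 0xFF);
  }

  print(bytes.map((b) => b.toRadixString(16)).toList());
  // [ff, fe, 3d, d8, 25, dd]
}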
I am using the package utf_convert to encode the string:
import 'package:utf_convert/utf_convert.dart' as utf;
List<int> convert(String input) {
  return utf.encodeUtf16le(input, true).cast<int>();
}
Is this a bug inside this package, or am I overlooking something here?
I wrote some simple tests to capture the problem:
import 'package:test/test.dart';
import 'package:utf_convert/utf_convert.dart' as utf;

void main() {
  var emojiString = '🔥';
  var emojiBytes = <int>[0xFF, 0xFE, 0x3D, 0xD8, 0x25, 0xDD];

  test('Decode Emoji', () {
    var emoji = utf.decodeUtf16le(emojiBytes);
    expect(emoji, emojiString);
  });

  test('Encode Emoji', () {
    var bytes = utf.encodeUtf16le(emojiString, true).cast<int>();
    expect(bytes, emojiBytes);
  });
}
The function "Decode Emoji" succeeds, but the second one, "Encode Emoji" fails with the assertion:
Expected: [255, 254, 61, 216, 37, 221] Actual: [255, 254, 253, 255, 253, 255]
After doing a lot of research, I think this is a bug in the library. The bytes after the BOM are 0xFFFD (the Unicode replacement character) written twice in little-endian order, which suggests the encoder replaces each half of the surrogate pair instead of encoding the pair as one code point. The code in utf_convert is a fork of a discontinued package found here.
The workaround I went with was to use other code that still exists in the Dart libraries; I found a hint in this SO post.
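For completeness, here is a minimal sketch of that kind of SDK-only approach (the function name and signature are my own, not the code from the linked post): since Dart strings are already UTF-16, each code unit only needs to be written out in little-endian order, with an optional BOM in front.
import 'dart:typed_data';

// Sketch of a UTF-16LE encoder using only dart:typed_data.
// Name and signature are illustrative, not from the linked post.
Uint8List encodeUtf16LeBytes(String input, {bool writeBom = true}) {
  final units = input.codeUnits;
  final bytes = Uint8List((writeBom ? 2 : 0) + units.length * 2);
  final data = ByteData.view(bytes.buffer);
  var offset = 0;
  if (writeBom) {
    data.setUint16(offset, 0xFEFF, Endian.little); // written as FF FE
    offset += 2;
  }
  for (final unit in units) {
    data.setUint16(offset, unit, Endian.little);
    offset += 2;
  }
  return bytes;
}

void main() {
  print(encodeUtf16LeBytes('🔥').map((b) => b.toRadixString(16)).toList());
  // [ff, fe, 3d, d8, 25, dd]
}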
I then implemented a new library of my own, which others facing the same issue can use too. It is hosted on GitHub and pub.dev under the MIT license.