I am transforming MIME messages to XML so that I can submit them to a mail merge service as SOAP requests, but emoji are giving me problems. For example, I'd like the smiley 😃 to be converted to &#128515;.
I'm using XStream to handle my conversions, but it doesn't properly encode emoji and other characters stored as high/low surrogate pairs (see the example test case below). I may be missing some crucial XStream configuration.
I have found this project that is based on this project which does conversions for specific Japanese cell phone providers via a hard-coded mapping, but I feel like this problem is probably solved more elegantly in existing Oracle or third-party (Apache, etc.) libraries.
From what I've read and heard, NuSOAP addresses this issue for PHP, but I'd like to stay in the Java/Groovy world for emoji conversion so I can use a compatible library.
What tools/approaches are you using to handle emoji conversion to XML on the JVM?
import junit.framework.TestCase;
import com.thoughtworks.xstream.XStream;

public class XStreamTest extends TestCase {
    public void testXStreamEmojiEncoding() {
        // Expected: each emoji replaced by its decimal character reference
        final String expected = "Open mouth smiley &#128515; and two chicken heads followed by a period &#128020;&#128020;.";
        final String original = "Open mouth smiley 😃 and two chicken heads followed by a period 🐔🐔.";
        final XStream xStream = new XStream();
        final String returned = xStream.toXML(original);
        assertEquals("<string>" + expected + "</string>", returned);
    }
}
The above test looks for an HTML decimal representation of the emoji but I'll accept other formats that will work for MIME.
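For context, the number in a decimal reference is simply the Unicode code point of the character, which Java stores as two UTF-16 `char`s (a high and a low surrogate) when it lies above U+FFFF. A minimal sketch of that relationship, using only the JDK; the class name `SurrogateDemo` and the helper `decimalReference` are made up for illustration:

```java
final class SurrogateDemo {
    // Builds the decimal character reference for the code point starting at index 0.
    static String decimalReference(String s) {
        // codePointAt joins a high/low surrogate pair into one code point if present
        int codePoint = s.codePointAt(0);
        return "&#" + codePoint + ";";
    }

    public static void main(String[] args) {
        String smiley = "\uD83D\uDE03"; // U+1F603, the open-mouth smiley
        System.out.println(smiley.length());                           // UTF-16 code units
        System.out.println(smiley.codePointCount(0, smiley.length())); // Unicode code points
        System.out.println(decimalReference(smiley));
    }
}
```

The string has a length of 2 but contains a single code point, which is why code that processes text one `char` at a time tends to mishandle emoji.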
I recently wrote a library for this: emoji-java
Here is the kind of output you would get:
String str = "An 😀awesome 😃string with a few 😉emojis!";
String result = EmojiParser.parseToAliases(str);
System.out.println(result);
// Prints:
// "An :grinning:awesome :smiley:string with a few :wink:emojis!"
You can either add the jar to your project or use the maven dependency:
<dependency>
    <groupId>com.vdurmont</groupId>
    <artifactId>emoji-java</artifactId>
    <version>1.0.0</version> <!-- Or whatever the version will be when you read this post -->
</dependency>
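If adding a dependency is not an option, decimal references of the kind the question asks for can also be produced with plain JDK code-point iteration. This is a rough sketch under stated assumptions, not part of emoji-java or XStream: the class `SupplementaryEscaper` and its method `escapeSupplementary` are hypothetical names, and it only rewrites characters above U+FFFF, leaving everything else untouched.

```java
final class SupplementaryEscaper {
    // Replaces every supplementary code point (above U+FFFF, i.e. anything
    // stored as a surrogate pair) with a decimal character reference.
    static String escapeSupplementary(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            if (Character.isSupplementaryCodePoint(cp)) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp); // BMP characters pass through unchanged
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \uD83D\uDE03 is the smiley, \uD83D\uDC14 is the chicken head
        String original = "Smiley \uD83D\uDE03 and two chickens \uD83D\uDC14\uD83D\uDC14.";
        System.out.println(escapeSupplementary(original));
    }
}
```

One caveat: this should probably run on the serialized XML rather than on the input string, because an XML serializer will normally escape the `&` of a reference that is already present in the text, turning it into `&amp;#128515;`.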