
Using Emoji literals in Clojure source


On Linux with a UTF-8-enabled console:

Clojure 1.6.0
user=> (def c \の)
#'user/c
user=> (str c)
"の"
user=> (def c \🍒)

RuntimeException Unsupported character: \🍒  clojure.lang.Util.runtimeException (Util.java:221)
RuntimeException Unmatched delimiter: )  clojure.lang.Util.runtimeException (Util.java:221)

I was hoping to have an emoji-rich Clojure application with little effort, but it appears I will be looking up and typing in emoji codes? Or am I missing something obvious here? 😞


Solution

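  • Java represents text as UTF-16, and a char is a single 16-bit code unit. Emoji are "supplementary characters": their code points lie above U+FFFF and cannot be represented in 16 bits, so an emoji does not fit in a single char. That is why the reader rejects the \🍒 literal while accepting \の.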

    http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html

    In essence, a supplementary character is represented not as a char but as an int code point (inside a String it occupies two chars, a surrogate pair), and there are special APIs for dealing with them.

    One way is with (Character/toChars 128516), where 128516 is the decimal form of U+1F604. This returns a char array that you can convert to a string to print: (apply str (Character/toChars 128516)). Or you can create a String directly from an array of code point ints with (String. (int-array [128516]) 0 1). Whether the glyph actually displays depends on all the various things between Java/Clojure and your eyeballs (console encoding, terminal, font), so that may or may not do what you want.

    The format API also supports supplementary characters, so that may be the easiest route. Its %c conversion accepts an int code point but not Clojure's default long, so you'll need a cast: (format "Smile! %c" (int 128516)).
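
The points above can be pulled together in a small sketch. The helper name code-point->str is hypothetical (it is not part of clojure.core), and the last form assumes the source file is saved and read as UTF-8. It builds a string from a code point, then shows that the result is one code point stored as two UTF-16 chars:

```clojure
;; A minimal sketch, not a definitive implementation.
;; `code-point->str` is a hypothetical helper name, not a clojure.core function.
(defn code-point->str
  "Returns a String containing the single Unicode code point cp."
  [cp]
  ;; Character/toChars yields one char for BMP code points and a
  ;; two-char surrogate pair for supplementary ones such as emoji.
  (String. (Character/toChars (int cp))))

(def cherries (code-point->str 0x1F352)) ; U+1F352, the cherries emoji

;; One code point, but two UTF-16 code units:
(count cherries)                                      ; 2
(.codePointCount ^String cherries 0 (count cherries)) ; 1
(.codePointAt ^String cherries 0)                     ; 127826

;; Assumption: this file is read as UTF-8. String literals can then hold
;; the emoji directly; only the single-character literal \🍒 is rejected.
(def label (str "I like " "🍒"))
```

If the aim is simply an emoji-rich application, keeping emoji inside string literals (rather than character literals) may avoid looking up code points at all.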