How can I make an ICU data file with information for emojis only?


I'm trying to pare down the libicudata.a file to the bare minimum that still lets me run the following check:

u_stringHasBinaryProperty(icu::UnicodeString::fromUTF8("🤙🏿").getBuffer(), -1, UProperty::UCHAR_RGI_EMOJI);

Following the instructions found here, I crafted the following filter file and used it when configuring ICU's build.

{
  "strategy": "additive",
  "resourceFilters": [
    {
      "categories": [
        "misc"
      ],
      "files": {
        "includelist": [
          "emoji-sequences",
          "emoji-zwj-sequences"
        ]
      },
      "rules": []
    }
  ]
}

I did end up with a (much) smaller file (17 KB), but it's obviously not working. My code compiles, links and runs, but fails the test.


Solution

  • The good folks at icu-support@lists.sourceforge.net helped me out. The information was (temporarily) missing from their documentation. The following filter is what I was looking for:

    {
      "strategy": "additive",
      "featureFilters": {
        "uemoji": "include"
      }
    }