
Unicode glyphs not combined properly on Windows Forms


Problem: Javanese script (using Google's Noto Sans Javanese font) is rendered and "combined" properly in HTML, but not in a Windows Forms application (C# .NET, Visual Studio 2017).

Edit: My computer uses Windows 7, 64-bit.

Noto Sans Javanese direct download link (.zip)


Glyphs Used

There are many cases to show that the glyphs are not combined properly, but here's one example I used:

  1. JAVANESE LETTER NA, U+A9A4, &#43428;
  2. JAVANESE PANGKON, U+A9C0, &#43456;
  3. JAVANESE LETTER TA, U+A9A0, &#43424;
  4. JAVANESE VOWEL SIGN PEPET, U+A9BC, &#43452;
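
The hexadecimal code points and the decimal character references in this list are the same numbers written in two bases. As a quick sanity check, here is a minimal console sketch that prints both forms for each character of the test string. The class name `CodePointDump` and the method `Describe` are hypothetical names chosen for illustration; the sketch assumes every character is in the Basic Multilingual Plane, which holds for these four.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointDump
{
    // Returns one "U+XXXX = &#N;" line per UTF-16 code unit.
    // Assumption: all characters are in the BMP, so one char is one code point.
    public static List<string> Describe(string text)
    {
        var lines = new List<string>();
        foreach (char c in text)
        {
            lines.Add(string.Format("U+{0:X4} = &#{1};", (int)c, (int)c));
        }
        return lines;
    }

    public static void Main()
    {
        // NA, PANGKON, TA, VOWEL SIGN PEPET
        foreach (string line in Describe("\uA9A4\uA9C0\uA9A0\uA9BC"))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // U+A9A4 = &#43428;
        // U+A9C0 = &#43456;
        // U+A9A0 = &#43424;
        // U+A9BC = &#43452;
    }
}
```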

Javanese Script Unicode Specification direct download link (.pdf)


Correct/desired behaviour

All four of these glyphs should be "combined" and displayed as a single unit.

HTML code:

<html>
<head>
    <meta charset="utf-8">
    <style>
    .javanese {
    font-family: "Noto Sans Javanese";
    font-size: 66px;
    }
    </style>
</head>
<body>
    <div class="javanese">ꦤ꧀ꦠꦼ</div>
    <div class="javanese">&#43428;&#43456;&#43424;&#43452;</div>
</body>
</html>

HTML result:

Correct rendering of Javanese Script on HTML


Incorrect rendering on Windows Forms (C# .NET)

I am using Visual Studio 2017 Community, creating a Windows Forms Desktop Application.
The Label components use the "Noto Sans Javanese" font.

C# code:

using System.Windows.Forms;

namespace WindowsFormsApp1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();

            this.label1.Text = "\uA9A4\uA9C0\uA9A0\uA9BC";
            this.label2.Text = "ꦤ꧀ꦠꦼ"; // Copied from HTML

            // This one is rendered correctly:
            // Thai character "ko kai" (U+0E01) followed by eight combining "mai tho" marks (U+0E49).
            this.label3.Text = "\u0E01\u0E49\u0E49\u0E49\u0E49\u0E49\u0E49\u0E49\u0E49";
        }
    }
}

C# result:

Incorrect rendering of Javanese script on Windows Form Application


Questions

  1. What is the reason for this behavior? Can someone explain?
  2. What should I do to make the Javanese script "combine" correctly in a Windows Forms application?

Thank you very much!


Solution

  • I have gotten a few options and insights that help resolve this problem. Apparently, this is only a problem with Windows Forms on Windows 7.

    So far, my options are:

    1. Switch to a WPF application (best option, in my opinion)
    2. Host a WPF control inside Windows Forms
    3. Render the text to a bitmap using a third-party library, for example HarfBuzz
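
For the second option, a minimal sketch of hosting a WPF `TextBlock` in a Windows Forms form through `ElementHost` might look like the following. This is an untested illustration, not a verified fix: the form name `JavaneseHostForm` is hypothetical, the project is assumed to reference WindowsFormsIntegration, PresentationCore, PresentationFramework, WindowsBase and System.Xaml, and the font is assumed to be installed under the family name "Noto Sans Javanese".

```csharp
using System.Windows.Forms;
using System.Windows.Forms.Integration; // ElementHost (WindowsFormsIntegration.dll)

namespace WindowsFormsApp1
{
    // Hypothetical form that shows the test string through WPF text rendering
    // instead of a Windows Forms Label.
    public class JavaneseHostForm : Form
    {
        public JavaneseHostForm()
        {
            // WPF control that draws the text. Fully qualified names avoid
            // clashes with identically named Windows Forms types.
            var textBlock = new System.Windows.Controls.TextBlock
            {
                Text = "\uA9A4\uA9C0\uA9A0\uA9BC",
                // Assumption: the font is installed under this family name.
                FontFamily = new System.Windows.Media.FontFamily("Noto Sans Javanese"),
                FontSize = 66
            };

            // ElementHost is the Windows Forms control that wraps a WPF element.
            var host = new ElementHost
            {
                Dock = DockStyle.Fill,
                Child = textBlock
            };

            Controls.Add(host);
        }
    }
}
```

The form can be shown like any other, for example with `Application.Run(new JavaneseHostForm());` from an `[STAThread]` entry point. Whether the glyphs actually combine still depends on WPF's text stack on the target machine, so it is worth checking on Windows 7 itself.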

    Here are the sources; credit goes to the respective authors:


    1. cheong00 posted a great explanation on an MSDN thread:

    Since Win7 has Unicode 5.1 support only and the character \uA9A4 falls in Unicode 5.2 range, the GDI+ text rendering function may not be able to handle the glyph hints correctly. (I'm not expert on i18n issues, so don't know whether special glyph hint support is needed)

    Since IE and other web browsers such as Chrome and Firefox all comes with their own font rendering engine, they're not subject to GDI+ rendering limitations.

    Btw, also tested setting "UseCompatibleTextRendering" to true does not help either.

    On the other hand, WPF forms does render the text correctly. So consider changing it to WPF application, or replace necessary controls with WinForm hosted WPF controls.


    2. u/GoogleBingLady posted an insight on a Reddit thread:

    Font shaping (Bidirectionality, Context-based shaping, ligatures, positioning & reordering) is a very, very complex topic. Although I'd be surprised, it may be that Windows Forms do not support font shaping.

    A workaround would be to use a library like HarfBuzz, render the result to a bitmap and then display that bitmap. See http://behdad.org/text/ for details.

    In fact, your problem is described here on page 8: http://www.panl10n.net/Presentations/Cambodia/Pema/LocalizationofLinux(Bhutan).pdf