Tags: internationalization, globalization, directwrite

How do I specify directionality of ambiguous characters to IDWriteTextLayout?


Some characters, such as whitespace and punctuation marks, have ambiguous directionality. This can lead to text layout situations where there doesn't appear to be a single correct layout without additional data to resolve the ambiguity. Consider this text:

\u05e9\u05e0\u05d1\u05d2abcd!

That's four Hebrew characters (unambiguously right-to-left), four English characters (unambiguously left-to-right), and one punctuation mark (ambiguous). If I lay out that string in an IDWriteTextLayout with DWRITE_READING_DIRECTION_RIGHT_TO_LEFT, I get the following:

Rendered screen shot showing punctuation on far left.

The punctuation mark appears to be treated as a right-to-left character which is starting a new right-to-left block to the left of the English, which seems perfectly reasonable, especially considering that right-to-left was the specified reading direction. However, it's also entirely reasonable to expect the punctuation mark to be treated as a left-to-right character associated with the embedded left-to-right English text, which would mean it should appear to the right of the 'd'.

My app knows exactly how it wants this character to be treated. How do I pass that data to IDWriteTextLayout to resolve this ambiguity?

I found the SetLocaleName method and thought that it must be the answer, but I can't seem to get it to affect the result at all. I also found the localeName parameter when creating an IDWriteTextFormat (which is then used to create the IDWriteTextLayout).

If my goal is for this to generally be Hebrew text with a string of embedded US English, I would think I'd want to use locale he on the IDWriteTextFormat and then use SetLocaleName to override that with locale en-US on character range [4-9]. However, doing so has no effect. In fact, I can't get any combination of locales used in those places to have any effect on the layout at all, whether I restrict them to a subrange or apply them to the entire string.

Am I wrong in thinking that these APIs should serve this purpose? If so, what APIs should I be using? Or is there really no way to tell IDWriteTextLayout to resolve this ambiguity differently? Am I maybe using the APIs wrong? Here is the test code I'm using to create this IDWriteTextLayout:

TestTextRenderer::TestTextRenderer(const std::shared_ptr<DX::DeviceResources>& deviceResources) : 
    m_deviceResources(deviceResources),
    m_text(L"\u05e9\u05e0\u05d1\u05d2abcd!"),
    m_readingDirection(DWRITE_READING_DIRECTION_RIGHT_TO_LEFT),
    m_formatLocale(L"en-US"),
    m_layoutLocale(L"en-US")
{
    ComPtr<IDWriteTextFormat> textFormat;
    DX::ThrowIfFailed(
        m_deviceResources->GetDWriteFactory()->CreateTextFormat(
            L"Segoe UI",
            nullptr,
            DWRITE_FONT_WEIGHT_MEDIUM,
            DWRITE_FONT_STYLE_NORMAL,
            DWRITE_FONT_STRETCH_NORMAL,
            24.0f,
            m_formatLocale.c_str(),
            &textFormat
        )
    );
    DX::ThrowIfFailed(textFormat->SetReadingDirection(m_readingDirection));

    DX::ThrowIfFailed(
        m_deviceResources->GetDWriteFactory()->CreateTextLayout(
            m_text.c_str(),
            static_cast<UINT32>(m_text.length()),
            textFormat.Get(),
            250.0f,
            100.0f,
            &m_textLayout
        )
    );

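    // DWRITE_TEXT_RANGE holds UINT32 fields; cast explicitly to avoid a narrowing conversion from size_t.
    DWRITE_TEXT_RANGE all{0u, static_cast<UINT32>(m_text.size())};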
    DX::ThrowIfFailed(m_textLayout->SetLocaleName(m_layoutLocale.c_str(), all));

    DX::ThrowIfFailed(m_deviceResources->GetD2DFactory()->CreateDrawingStateBlock(&m_stateBlock));
    CreateDeviceDependentResources();
}

Solution

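  • From the point of view of the Unicode Bidirectional Algorithm, I don't think there is any ambiguity here. The base direction set on the IDWriteTextFormat or IDWriteTextLayout is crucial, but beyond that, run directions are derived strictly from the code points. A neutral character such as '!' that sits between a left-to-right run and the end of a right-to-left paragraph takes the paragraph direction, which is the layout you are seeing.

    Setting the locale won't change direction. It can affect shaping, but the end result depends on which features the run's font has.

    I think you can get the abcd! output you want by wrapping that part of the text in LRE/PDF controls (U+202A LEFT-TO-RIGHT EMBEDDING and U+202C POP DIRECTIONAL FORMATTING).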
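
To illustrate the LRE/PDF suggestion, here is a minimal sketch in standard C++ that wraps a range of the string in U+202A (LRE) and U+202C (PDF) before the string is handed to CreateTextLayout. WrapInLtrEmbedding is a hypothetical helper name, not a DirectWrite API, and the range (start 4, length 5) assumes the sample string from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// U+202A LEFT-TO-RIGHT EMBEDDING and U+202C POP DIRECTIONAL FORMATTING.
constexpr wchar_t kLRE = L'\u202A';
constexpr wchar_t kPDF = L'\u202C';

// Hypothetical helper (not part of DirectWrite): returns a copy of `text`
// in which the range [start, start + length) is wrapped in LRE ... PDF.
// Out-of-range arguments are clamped to the end of the string.
std::wstring WrapInLtrEmbedding(const std::wstring& text,
                                std::size_t start,
                                std::size_t length)
{
    start = std::min(start, text.size());
    length = std::min(length, text.size() - start);

    std::wstring result;
    result.reserve(text.size() + 2);
    result.append(text, 0, start);                           // text before the range
    result.push_back(kLRE);                                  // open the LTR embedding
    result.append(text, start, length);                      // the embedded LTR text
    result.push_back(kPDF);                                  // close the embedding
    result.append(text, start + length, std::wstring::npos); // text after the range
    return result;
}
```

With the question's string, WrapInLtrEmbedding(m_text, 4, 5) yields the four Hebrew letters followed by LRE, "abcd!" and PDF; that result would replace m_text in the CreateTextLayout call. The two control characters add two UTF-16 code units, so any DWRITE_TEXT_RANGE computed against the original string needs to be shifted accordingly.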