objective-c unicode ios6 nsstring

iOS CFStringTransform and Đ


I'm working on an iOS app in which I have to list and sort people's names. I have some problems with special characters.

I need some clarification on Martin R's answer at https://stackoverflow.com/a/15154823/2148377, which says:

You could use the CoreFoundation CFStringTransform function which does almost all transformations from your list. Only "đ" and "Đ" have to be handled separately:

Why this particular letter? Where does this come from? Where can I find the documentation?

Thanks a lot.


Solution

  • I am not 100% sure, but I think the reason can be seen in the Unicode Character Database file UnicodeData.txt: http://www.unicode.org/Public/6.2.0/ucd/UnicodeData.txt

    For example, the entry for "à" is

    00E0;LATIN SMALL LETTER A WITH GRAVE;Ll;0;L;0061 0300;;;;N;LATIN SMALL LETTER A GRAVE;;00C0;;00C0
    

    where field #6 ("0061 0300") is the "Decomposition mapping" into "a" (U+0061) + U+0300 (COMBINING GRAVE ACCENT), therefore

    CFStringTransform(..., kCFStringTransformStripCombiningMarks, ...)
    

    transforms "à" into "a".

    The entries for "Đ" and "đ" are

    0110;LATIN CAPITAL LETTER D WITH STROKE;Lu;0;L;;;;;N;LATIN CAPITAL LETTER D BAR;;;0111;
    0111;LATIN SMALL LETTER D WITH STROKE;Ll;0;L;;;;;N;LATIN SMALL LETTER D BAR;;0110;;0110
    

    where field #6 is empty, so these characters do not have a decomposition into a "base character" and a "combining mark". There is no combining mark to strip, which would explain why they have to be handled separately.

    So the question remains: Which standard determines that a "normalized form" of "đ / Đ" is "d / D"?
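
The decomposition argument above can be checked without reading UnicodeData.txt by hand. The following is only a sketch in Python (not the CoreFoundation API): `unicodedata.decomposition` returns the decomposition mapping field, and `strip_combining_marks` is a hypothetical helper that approximates mark stripping as "decompose, drop nonspacing marks, recompose". That it behaves like kCFStringTransformStripCombiningMarks is an assumption, not something taken from Apple's documentation. `fold_for_sorting` is likewise a made-up name that mirrors the "handled separately" step from the quoted answer.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    """Hypothetical helper: decompose (NFD), drop nonspacing marks (Mn), recompose (NFC).

    Assumed to approximate kCFStringTransformStripCombiningMarks; not Apple's code.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)


def fold_for_sorting(text: str) -> str:
    """Hypothetical helper: strip marks, then map the stroke letters by hand."""
    return strip_combining_marks(text).replace("đ", "d").replace("Đ", "D")


if __name__ == "__main__":
    for ch in ("à", "Đ", "đ"):
        # decomposition() is "" when the decomposition mapping field is empty
        print(
            f"U+{ord(ch):04X}",
            f"decomposition={unicodedata.decomposition(ch)!r}",
            f"stripped={strip_combining_marks(ch)!r}",
            f"folded={fold_for_sorting(ch)!r}",
        )
```

Under these assumptions, "à" reports the decomposition "0061 0300" and is stripped to "a", while "Đ" and "đ" report an empty decomposition and pass through `strip_combining_marks` unchanged. Only the explicit replacement in `fold_for_sorting` turns them into "D" and "d".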