c#, localization, ascii, extended-ascii

Convert two ascii characters to their 'corresponding' one character extended ascii representation


The problem: I have two fixed-width strings from an external system. The first contains the base characters (such as a-z); the second may contain diacritics that have to be combined with the characters at the same positions in the first string to produce the actual characters.

string asciibase = "Dutch has funny chars: a,e,u";
string diacrits  = "                       ' \" \"";

//no clue what to do

string result = "Dutch has funny chars: á,ë,ü";

I could write a massive search and replace for every combination of character and diacritic, but I was hoping for something a bit more elegant.

Does anybody have a clue how to solve this one? I tried calculating the decimal values and using string.Normalize (C#), but got no results. Google didn't turn up anything useful either.


Solution

  • I cannot find an easy solution except using lookup tables:

    public void TestMethod1()
    {
        string asciibase = "Dutch has funny chars: a,e,u";
        string diacrits = "                       ' \" \"";
        var merged = DiacritMerger.Merge(asciibase, diacrits);
        // expected: "Dutch has funny chars: á,ë,ü"
    }
    

    [EDIT: Simplified code after suggestions in the answers from @JonB and @Oliver]

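    using System.Collections.Generic;
    using System.Linq;

    public class DiacritMerger
    {
        // Maps each ASCII marker to the Unicode combining mark it stands for.
        static readonly Dictionary<char, char> _lookup = new Dictionary<char, char>
                             {
                                 {'\'', '\u0301'}, // combining acute accent
                                 {'"', '\u0308'}   // combining diaeresis
                             };
    
        public static string Merge(string asciiBase, string diacrits)
        {
            // Pair the characters position by position; Zip stops at the end of the shorter string.
            var combined = asciiBase.Zip(diacrits, (ascii, diacrit) => DiacritVersion(diacrit, ascii));
            return new string(combined.ToArray());
        }
    
        private static char DiacritVersion(char diacrit, char character)
        {
            char combine;
            // Append the combining mark and normalize (form C by default) to get the precomposed character.
            // Taking [0] assumes a precomposed form exists; if not, the combining mark is dropped.
            return _lookup.TryGetValue(diacrit, out combine) ? new string(new [] {character, combine}).Normalize()[0] : character;
        }
    }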
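
A possible variation, offered as a sketch rather than a definitive implementation: string.Normalize on its own has nothing to compose here, because ' and " are ordinary spacing characters and not combining marks, which is why the lookup table is still needed. The version below appends the combining mark after each base character and normalizes the whole string once at the end. That way a combination with no precomposed form (for example q with an acute accent) keeps its combining mark where the `[0]` indexing above would drop it, and a diacritics string shorter than the base string does not truncate the result. The class name DiacriticComposer and the grave and circumflex markers are assumptions added for illustration; only the acute and diaeresis mappings come from the answer above.

```csharp
using System.Collections.Generic;
using System.Text;

public static class DiacriticComposer
{
    // Marker -> Unicode combining mark. The first two entries come from the
    // answer above; the last two are assumed markers, adjust them to whatever
    // the external system really sends.
    private static readonly Dictionary<char, char> CombiningMarks = new Dictionary<char, char>
    {
        { '\'', '\u0301' }, // combining acute accent
        { '"',  '\u0308' }, // combining diaeresis
        { '`',  '\u0300' }, // combining grave accent (assumption)
        { '^',  '\u0302' }  // combining circumflex accent (assumption)
    };

    public static string Merge(string asciiBase, string diacritics)
    {
        var sb = new StringBuilder(asciiBase.Length * 2);
        for (int i = 0; i < asciiBase.Length; i++)
        {
            sb.Append(asciiBase[i]);

            // Only look at the marker if the diacritics string is long enough.
            char mark;
            if (i < diacritics.Length && CombiningMarks.TryGetValue(diacritics[i], out mark))
            {
                sb.Append(mark);
            }
        }

        // Form C composes base + combining mark into one character where a
        // precomposed form exists and leaves the pair untouched otherwise.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

Usage is the same as before: `DiacriticComposer.Merge("a,e,u", "' \" \"")` pairs each marker with the character at the same position. The trade-off is that the result can be longer than the input when a combination has no single-character form, which matters if the output must stay fixed width.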