Tags: .net, string, unicode, windows-store-apps, unicode-normalization

How do I normalize a string?


In .NET you can normalize (NFC, NFD, NFKC, NFKD) strings with String.Normalize() and there is a Text.NormalizationForm enum.

In .NET for Windows Store Apps, neither is available. I have looked in the String class and in the System.Text and System.Globalization namespaces, but found nothing.

Have I missed something? How do I normalize strings in Windows Store Apps?

Does anyone have an idea why the Normalize method was not made available for Store Apps?


Solution

  • As you've pointed out, the Normalize method is not available on the String class in Windows Store apps.

    However, String.Normalize just calls the NormalizeString function in the Windows API.

    Even better, this function is in the approved list of Win32 and COM API functions usable in Windows Store apps.

    That said, you'd make the following declarations:

    public enum NORM_FORM 
    { 
      NormalizationOther  = 0,
      NormalizationC      = 0x1,
      NormalizationD      = 0x2,
      NormalizationKC     = 0x5,
      NormalizationKD     = 0x6
    };
    
    // Requires the System.Runtime.InteropServices and System.Text namespaces.
    [DllImport("Normaliz.dll", CharSet = CharSet.Unicode, ExactSpelling = true,
        SetLastError = true)]
    public static extern int NormalizeString(NORM_FORM NormForm,
        string lpSrcString,
        int cwSrcLength,
        StringBuilder lpDstString,
        int cwDstLength);
    

    You'd then call it like so:

    // The form.
    NORM_FORM form = ...;
    
    // String to normalize.
    string unnormalized = "...";
    
    // Get the buffer required.
    int bufferSize = 
        NormalizeString(form, unnormalized, unnormalized.Length, null, 0);
    
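    // Allocate the buffer.
    var buffer = new StringBuilder(bufferSize);
    
    // Normalize. Pass the capacity: Length is 0 for a new StringBuilder.
    int written = NormalizeString(form, unnormalized, unnormalized.Length,
        buffer, buffer.Capacity);
    
    // A return value <= 0 signals failure; only then is the last error meaningful.
    if (written <= 0)
    {
        int error = Marshal.GetLastWin32Error();
        // Act on the error here.
    }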
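
    The size returned by the first call is only an estimate, so a more defensive version retries when the second call reports ERROR_INSUFFICIENT_BUFFER. The following is a minimal sketch of that idea, not a definitive implementation. The class name StringNormalizer, the method name Normalize, and the retry cap of 10 are illustrative assumptions. It also assumes, per my reading of the NormalizeString documentation, that on ERROR_INSUFFICIENT_BUFFER the return value is the negative of the suggested buffer size. It uses a char[] instead of a StringBuilder so the returned character count can be used directly as the result length.

```csharp
using System;
using System.Runtime.InteropServices;

public enum NORM_FORM
{
    NormalizationOther = 0,
    NormalizationC = 0x1,
    NormalizationD = 0x2,
    NormalizationKC = 0x5,
    NormalizationKD = 0x6
}

// Hypothetical helper; the names are illustrative and not part of any framework.
public static class StringNormalizer
{
    // Win32 error code ERROR_INSUFFICIENT_BUFFER.
    private const int ErrorInsufficientBuffer = 122;

    // Assumed cap on retries so a misbehaving call cannot loop forever.
    private const int MaxAttempts = 10;

    [DllImport("Normaliz.dll", CharSet = CharSet.Unicode, ExactSpelling = true,
        SetLastError = true)]
    private static extern int NormalizeString(NORM_FORM NormForm,
        string lpSrcString,
        int cwSrcLength,
        [Out] char[] lpDstString,
        int cwDstLength);

    public static string Normalize(string value, NORM_FORM form)
    {
        if (value == null) throw new ArgumentNullException("value");
        if (value.Length == 0) return value;

        // First call with a null buffer: ask for an estimated buffer size.
        int size = NormalizeString(form, value, value.Length, null, 0);
        if (size <= 0)
        {
            throw new InvalidOperationException(
                "NormalizeString failed, error " + Marshal.GetLastWin32Error());
        }

        for (int attempt = 0; attempt < MaxAttempts; attempt++)
        {
            var buffer = new char[size];
            int written = NormalizeString(form, value, value.Length,
                buffer, buffer.Length);

            // Success: 'written' is the number of characters produced.
            if (written > 0) return new string(buffer, 0, written);

            int error = Marshal.GetLastWin32Error();
            if (error != ErrorInsufficientBuffer)
            {
                throw new InvalidOperationException(
                    "NormalizeString failed, error " + error);
            }

            // Assumption: -written is the suggested size. Doubling is a
            // defensive fallback in case that value is not larger.
            size = Math.Max(-written, size * 2);
        }

        throw new InvalidOperationException(
            "NormalizeString kept reporting an insufficient buffer.");
    }
}
```

    With this helper, a call such as StringNormalizer.Normalize("e\u0301", NORM_FORM.NormalizationC) would be expected to return the single precomposed character "\u00E9", since NFC composes "e" followed by a combining acute accent.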