Search code examples
c#stringutf-8control-characters

Removing control characters from a UTF-8 string


I found this question but it removes all valid utf-8 characters also (returns me a blank string, while there are valid utf-8 characters plus control characters). As I read about utf-8, there's not a specific range for control characters and each character set has its own control characters.

How can I modify above solution to only remove control characters ?


Solution

  • I think the following code will work for you:

    public static string RemoveControlCharacters(string inString)
    {
        if (inString == null) return null;
        StringBuilder newString = new StringBuilder();
        char ch;
        for (int i = 0; i < inString.Length; i++)
        {
            ch = inString[i];
            if (!char.IsControl(ch))
            {
                newString.Append(ch);
            }
        }
        return newString.ToString();
    }