c# .net xml escaping

Escape invalid XML characters in C#


I have a string that contains invalid XML characters. How can I escape (or remove) invalid XML characters before I parse the string?


Solution

  • To remove invalid XML characters, I suggest using the XmlConvert.IsXmlChar method. It has been available since .NET Framework 4 and is also present in Silverlight. Here is a small sample:

    // Requires: using System; using System.Linq; using System.Xml;
    void Main() {
        string content = "\v\f\0";
        Console.WriteLine(IsValidXmlString(content)); // False
    
        content = RemoveInvalidXmlChars(content);
        Console.WriteLine(IsValidXmlString(content)); // True
    }
    
    static string RemoveInvalidXmlChars(string text) {
        var validXmlChars = text.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray();
        return new string(validXmlChars);
    }
    
    static bool IsValidXmlString(string text) {
        try {
            XmlConvert.VerifyXmlChars(text);
            return true;
        } catch {
            return false;
        }
    }
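
    One caveat with the per-character filter above: as far as I know, XmlConvert.IsXmlChar returns false for each half of a surrogate pair, so characters outside the Basic Multilingual Plane (emoji, for example) would be stripped even though they are valid in XML. A minimal sketch of a variant that keeps well-formed pairs, assuming XmlConvert.IsXmlSurrogatePair(lowChar, highChar) is available; the class and method names are hypothetical and not part of the original answer:

```csharp
using System;
using System.Text;
using System.Xml;

static class XmlSanitizer
{
    // Hypothetical helper: drops characters that are not valid XML characters,
    // but keeps well-formed surrogate pairs (high surrogate followed by low).
    public static string RemoveInvalidXmlCharsKeepingSurrogates(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char ch = text[i];
            if (XmlConvert.IsXmlChar(ch))
            {
                // Ordinary valid character: keep it.
                sb.Append(ch);
            }
            else if (i + 1 < text.Length &&
                     XmlConvert.IsXmlSurrogatePair(text[i + 1], ch))
            {
                // ch is a high surrogate and text[i + 1] its low surrogate:
                // keep both and skip the second half.
                sb.Append(ch).Append(text[i + 1]);
                i++;
            }
            // Anything else (control characters, lone surrogates) is dropped.
        }
        return sb.ToString();
    }
}
```

    For example, calling XmlSanitizer.RemoveInvalidXmlCharsKeepingSurrogates("a\0\U0001F600b") should drop the null character and keep the emoji, whereas the LINQ-based filter would be expected to drop both.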
    

    To escape invalid XML characters instead, I suggest using the XmlConvert.EncodeName method. Here is a small sample:

    // Requires: using System; using System.Xml;
    void Main() {
        const string content = "\v\f\0";
        Console.WriteLine(IsValidXmlString(content)); // False
    
        string encoded = XmlConvert.EncodeName(content);
        Console.WriteLine(IsValidXmlString(encoded)); // True
    
        string decoded = XmlConvert.DecodeName(encoded);
        Console.WriteLine(content == decoded); // True
    }
    
    static bool IsValidXmlString(string text) {
        try {
            XmlConvert.VerifyXmlChars(text);
            return true;
        } catch {
            return false;
        }
    }
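
    Keep in mind that EncodeName is designed for XML names, so it also escapes characters that are valid in XML text but not allowed in a name, such as a space. A small sketch of that behavior (the class name is hypothetical, used only for illustration):

```csharp
using System;
using System.Xml;

static class EncodeNameScopeDemo
{
    public static void Run()
    {
        // A space is a valid XML character, but it is not valid inside
        // an XML name, so EncodeName escapes it as _x0020_.
        string encoded = XmlConvert.EncodeName("Order Detail");
        Console.WriteLine(encoded);                         // Order_x0020_Detail

        // DecodeName reverses the escaping.
        Console.WriteLine(XmlConvert.DecodeName(encoded));  // Order Detail
    }
}
```

    If the escaped text is meant for element content rather than a name, this extra escaping is harmless as long as every reader decodes it with DecodeName.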
    

    Update: Note that the encoding operation produces a string whose length is greater than or equal to the length of the source string. This can matter when you store the encoded string in a database column with a length limit and your app validates the source string's length against that limit.
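
    To illustrate the length growth: each escaped character is written as _xHHHH_ (seven characters), so the three-character string "\v\f\0" becomes 21 characters once encoded. A minimal sketch of validating the encoded length rather than the source length; the helper name and the maxColumnLength parameter are hypothetical:

```csharp
using System;
using System.Xml;

static class EncodedLengthCheck
{
    // Hypothetical helper: returns true when the encoded form of `source`
    // fits into a column that holds at most `maxColumnLength` characters.
    public static bool EncodedFits(string source, int maxColumnLength)
    {
        // EncodeName returns null for null input, so fall back to empty.
        string encoded = XmlConvert.EncodeName(source) ?? string.Empty;
        return encoded.Length <= maxColumnLength;
    }
}
```

    With this check, EncodedLengthCheck.EncodedFits("\v\f\0", 10) is false even though the source string has only three characters, while a limit of 21 or more passes.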