Are There Other Characters Like the Ampersand in XML That Need to Be Converted?


While writing a custom RSS feed for my PHP program, I ran into the issue that the ampersand (&) character has to be converted to &amp;. Are there other characters like this? Thanks for your information.

This is invalid:

<?xml version="1.0" encoding="UTF-8" ?>         
<rss version="2.0">
<channel>
    <title>custom user feed</title>                 
        <item>
            <description>
                <div>a & b</div>
            </description>
        </item>
</channel>      
</rss>

Reference: Why can't RSS handle the ampersand?


Solution

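  • Yes. At a bare minimum, < will cause problems, since it would be taken as the start of a tag; it is encoded as &lt;. XML predefines five entities in total: &amp; for &, &lt; for <, &gt; for >, &quot; for ", and &apos; for '. In element text, & and < must always be escaped; the quote characters only matter inside attribute values delimited by the same quote.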

    See http://en.wikipedia.org/wiki/XML#Escaping for more detail.
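
As an illustration only, here is a minimal sketch of escaping the question's description text before placing it in a feed. It is written in Python rather than PHP, and the helper name escape_xml_text is hypothetical. It relies on the standard library's xml.sax.saxutils.escape, which handles &, < and > by default; the two quote characters are added through its optional mapping argument.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def escape_xml_text(text: str) -> str:
    """Escape the five characters that have predefined XML entities.

    Hypothetical helper for illustration: escape() covers &, < and >,
    and the extra mapping adds the double and single quote.
    """
    return escape(text, {'"': "&quot;", "'": "&apos;"})


# The markup from the question, treated as plain text for the feed.
raw = "<div>a & b</div>"
escaped = escape_xml_text(raw)

# Build the element and parse it back to check that it is well-formed
# and that the original text survives the round trip.
feed = f"<description>{escaped}</description>"
parsed = ET.fromstring(feed)

print(escaped)
print(parsed.text == raw)
```

In PHP, htmlspecialchars() called with the ENT_XML1 and ENT_QUOTES flags should give comparable results, though that is worth checking against the PHP manual for the version in use.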