c# html .net xml xmldocument

How can I read the HTML inside an XML tag?


I have an XML document that contains, at some point, this:

<TESTO>
    <img src="../path/image.jpg" alt="" />
</TESTO>

Well, if I do:

string TESTO = m_oNode.SelectSingleNode("TESTO").InnerText;

TESTO will be empty. Why? How can I read the whole content? For other tags that don't contain HTML, everything works perfectly.

I use XmlDocument

EDIT - XML that causes an exception when I use InnerXml:

<TESTO>
    <table style="width: 100%;" border="0" cellspacing="0" cellpadding="0">
    <tbody>
    <tr>
    <td>&nbsp;</td>
    <td width="700"><a href="http://www.my.it/"><img src="/testata.jpg" alt="mycaf.it" width="700" height="333" border="0" /></a></td>
    <td>&nbsp;</td>
    </tr>
    <tr>
    <td>&nbsp;</td>
    <td style="text-align: center; background-color: #f5f5f5;" align="center" bgcolor="#f5f5f5"><br />
    <p style="color: #ee2e24; font-style: italic; font-size: 25px; font-family: Arial;">portale<br /> </p>
    </td>
    <td>&nbsp;</td>
    </tr>
    <tr>
    <td>&nbsp;</td>
    <td>&nbsp;</td>
    </tr>
    </tbody>
    </table>
</TESTO>

Solution

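  • InnerText returns only the text of a node and its descendants (plain text content, or the text parts of mixed content); the markup itself is dropped. An element that contains nothing but tags, such as the <img /> in the question, therefore gives an empty string. Use InnerXml instead.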

    Example:

    <A>
        Some text in mixed content
        <B>OnlyText</B>
    </A>
    

    Gives the result:

    • InnerText = "Some text in mixed content\r\nOnlyText"
    • InnerXml = "Some text in mixed content\r\n<B>OnlyText</B>"
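
    Applied to the XML from the question, the difference can be sketched as follows. This is a minimal example, not the asker's actual code: the <ROOT> wrapper element and the TestoReader helper are assumptions made up for illustration, and only <TESTO> and its <img /> child come from the question.

```csharp
using System;
using System.Xml;

static class TestoReader
{
    // Hypothetical helper: loads the XML and returns InnerText and InnerXml
    // of the first <TESTO> element found anywhere in the document.
    public static (string Text, string Markup) Read(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNode testo = doc.SelectSingleNode("//TESTO");
        if (testo == null)
            throw new InvalidOperationException("No <TESTO> element found.");

        return (testo.InnerText, testo.InnerXml);
    }

    static void Main()
    {
        // <ROOT> is an assumed wrapper; the question does not show the real root.
        string xml =
            "<ROOT><TESTO><img src=\"../path/image.jpg\" alt=\"\" /></TESTO></ROOT>";

        var result = Read(xml);

        // <TESTO> has no text nodes, so InnerText is the empty string.
        Console.WriteLine("InnerText: [" + result.Text + "]");

        // InnerXml serializes the child elements, so the <img /> tag is kept.
        Console.WriteLine("InnerXml:  [" + result.Markup + "]");
    }
}
```

    As for the EDIT: the second sample contains &nbsp;, which is not one of the five entities predefined by XML. Unless a DTD declares it, XmlDocument rejects the document with an XmlException, so that may be where the exception comes from, although the question does not show the actual error message.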