xml delphi delphi-2007

How can I get the Open XML DOM parser to parse an ntEntityRef node?


I have this XML input:

<?xml version="1.0" encoding="utf-8"?>
<string>
&lt;N/A&gt;
</string>

Here is a short code sample to illustrate the problem:

uses
  xmldom, oxmldom, XMLDoc, XMLIntf;

procedure TForm1.Test;
var
  Document     : IXMLDocument;
  StringNode   : IXMLNode;
  LessThanNode : IXMLNode;
begin
  DefaultDOMVendor := 'Open XML';
  Document         := LoadXMLData(Memo1.Lines.Text);
  StringNode       := Document.DocumentElement;
  LessThanNode     := StringNode.ChildNodes.First;
  ShowMessage(LessThanNode.Text); // Displays '' (an empty string)
  ShowMessage(LessThanNode.XML);  // Displays '&lt;'
  // Raises EXMLDocError: the string node contains more than a single ntText child node
  ShowMessage(StringNode.Text);
end;

How can I get the Open XML parser to convert &lt;, &gt;, and similar XML entities into the text they represent (such as < and >)?

I could write a workaround for the predefined entities in the XML specification: http://www.w3.org/TR/2008/REC-xml-20081126/#sec-predefined-ent

That won't help with any additional (custom) entity nodes, though.
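
For reference, the workaround for the predefined entities could look roughly like the sketch below. DecodePredefinedEntities is a hypothetical helper name, not part of XMLDoc or oxmldom, and it only covers the five entities that the XML specification predefines.

```pascal
uses
  SysUtils;

// Hypothetical helper (not part of any Delphi XML unit):
// decodes only the five entities predefined by the XML specification.
function DecodePredefinedEntities(const S: string): string;
begin
  Result := S;
  Result := StringReplace(Result, '&lt;',   '<',  [rfReplaceAll]);
  Result := StringReplace(Result, '&gt;',   '>',  [rfReplaceAll]);
  Result := StringReplace(Result, '&quot;', '"',  [rfReplaceAll]);
  Result := StringReplace(Result, '&apos;', '''', [rfReplaceAll]);
  // '&amp;' is decoded last so that '&amp;lt;' becomes '&lt;' rather than '<'.
  Result := StringReplace(Result, '&amp;',  '&',  [rfReplaceAll]);
end;
```

In the sample above it would be applied to the XML property of each ntEntityRef child, since LessThanNode.XML returns '&lt;' while LessThanNode.Text is empty. Entities declared in a DTD would pass through unchanged.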

Related: Why doesn't IXMLNode.IsTextElement return True for CDATA elements?


Solution

  • Newer versions of Delphi no longer ship the oxmldom unit, and newer versions of the parser, the so-called ADOM, are available:

    http://www.philo.de/xml/downloads.shtml

    So either switching to a different parser or upgrading Open XML solves the problem.
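
    As a minimal sketch of the first option, the code below assumes that the MSXML vendor (unit msxmldom) is an acceptable replacement and that COM is already initialized, as it normally is in a VCL forms application. ReadRootTextWithMSXML is a hypothetical function name used only for illustration.

```pascal
uses
  xmldom, msxmldom, XMLDoc, XMLIntf;

// Hypothetical helper: loads the XML with the MSXML DOM vendor instead of
// Open XML and returns the text of the document element.
// Assumes COM has been initialized (a console program would have to call
// CoInitialize itself).
function ReadRootTextWithMSXML(const XMLData: string): string;
var
  Document: IXMLDocument;
begin
  // Including msxmldom in the uses clause registers the 'MSXML' vendor.
  DefaultDOMVendor := 'MSXML';
  Document := LoadXMLData(XMLData);
  Result   := Document.DocumentElement.Text;
end;
```

    Whether the text comes back with the entities expanded depends on the vendor, so it is worth checking against the sample input from the question before relying on it.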