Tags: xml, xpath, libxml2

How to replace character entities in XPath?


Using libxml2, is there a way to replace the standard character entities (&lt;, &gt;, &apos;, &quot;, &amp;) with their corresponding characters (<, >, ', ", &) in an XPath expression?

For example, I would like to make it so that:

  • math[contains(., '1 &lt; 2')] searches for content that contains 1 < 2
  • item[@id="3&quot; bracket"] searches for attributes with value 3" bracket

libxml2 conveniently replaces these entities by default when parsing XML, but not when evaluating XPath expressions. Is there a way to enable replacement for XPath?

(Edit: I don't see anything about XPath in libxml2's documentation about entities.)

(Edit: If there's not a way to do this, then I would accept as an answer an explanation of why not. libxml2 seems quite thorough in its implementation of XML, so I suspect that if this is not supported, either it was an intentional choice or my proposal is somehow problematic.)


Solution

  • No, and the reason is that XPath is lexically defined independently of XML. An XPath expression can contain literal < and & characters without conflicting with their special meaning in XML, so XPath has no need for character entities.

    When XPath expressions are embedded in XML (for example, in XSLT), it is the XML syntax that requires < and & to be escaped. The XML parser then hands the application (including the XPath library) a string value for the XPath expression in which the character entities have already been expanded to the literal characters they stand for.
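
To make that division of labor concrete, here is a minimal sketch. It uses Python's standard-library ElementTree rather than libxml2, purely to stay dependency-free, so treat it as an illustration of the principle and not of libxml2's API. ElementTree supports only a subset of XPath (no contains()), so the predicates use exact matching instead. The sample document, the query element, and its select attribute are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import unescape

# Hypothetical sample document. The <query> element stands in for something
# like an XSLT select attribute that stores an XPath expression inside XML.
XML = """\
<root>
  <math>1 &lt; 2</math>
  <item id="3&quot; bracket"/>
  <query select="math[.='1 &lt; 2']"/>
</root>
"""

# The XML parser expands the entities while reading the document.
root = ET.fromstring(XML)

# In the XPath string itself, the characters are written literally.
literal_hits = root.findall("math[.='1 < 2']")
attr_hits = root.findall("item[@id='3\" bracket']")

# An entity reference inside the XPath string is just ordinary text,
# so it does not match the already-expanded content.
entity_hits = root.findall("math[.='1 &lt; 2']")

# An XPath stored in an XML attribute reaches the application already
# expanded, because the XML parser did that work before XPath sees it.
stored = root.find("query").get("select")
stored_hits = root.findall(stored)

# If an expression arrives still escaped from some source that was not
# XML-parsed, one option is to decode it yourself before evaluating it.
decoded = unescape(
    "item[@id='3&quot; bracket']",
    {"&quot;": '"', "&apos;": "'"},
)
decoded_hits = root.findall(decoded)

print(len(literal_hits), len(attr_hits), len(entity_hits))
print(stored, len(stored_hits))
print(decoded, len(decoded_hits))
```

The last step is the closest thing to "enabling replacement": decode the five predefined entities in the expression string yourself, then pass the result to the XPath evaluator. With libxml2 the same pre-processing would happen in your own code before calling the XPath evaluation function.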