I have the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ELEMENT root (entry*)>
<!ELEMENT entry (#PCDATA)>
<!ENTITY abc "a b c">
<!ENTITY xyz "x y z">
]>
<root>
<entry>&abc;</entry>
<entry>&xyz;</entry>
<entry>text</entry>
</root>
I use the following command to test my XPaths on it:
xmllint --xpath '...' test.xml
I am trying to match some custom entities with an XPath that looks like:
//entry[text() = '&abc;']
But it doesn't match anything. So I even tried:
//entry/text()
And the only result is text from the last entry, nothing from the first two. If text() doesn't return custom entities, is there anything else that does? Is there a way to match only entries containing &abc;?
You cannot test against the &abc; internal general entity reference itself, because an XML parser must substitute such a reference with its replacement text (a b c) wherever it appears in the document's content. By the time XPath is evaluated, the reference is gone and only the replacement text remains.
You can see this in action by changing your XPath from
//entry[text() = '&abc;']
which selects nothing, to
//entry[text() = 'a b c']
which selects the entry element containing the replacement text.
The replacement text should be available as text nodes, so
//entry/text()
selects three text nodes:
a b c
x y z
text
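To get this expected behavior from xmllint, use the (oddly named) --noent flag. Despite its name, --noent tells xmllint to substitute entity references with their replacement text rather than leave them as entity nodes.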
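The same substitution can be observed outside xmllint. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (not something the question uses); the names XML_DOC, texts, matches and literal are made up for illustration. It parses the document from the question and compares each entry's text against the replacement text and against the literal reference.

```python
import xml.etree.ElementTree as ET

# The document from the question, inlined so the sketch is self-contained.
XML_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ELEMENT root (entry*)>
<!ELEMENT entry (#PCDATA)>
<!ENTITY abc "a b c">
<!ENTITY xyz "x y z">
]>
<root>
<entry>&abc;</entry>
<entry>&xyz;</entry>
<entry>text</entry>
</root>"""

# The parser expands internal general entities declared in the internal
# DTD subset, so the tree only ever holds the replacement text.
root = ET.fromstring(XML_DOC)

# Text content of every entry, in document order.
texts = [entry.text for entry in root.findall("entry")]

# Matching on the replacement text finds the first entry.
matches = [entry for entry in root.findall("entry") if entry.text == "a b c"]

# Matching on the literal reference finds nothing, as in the question.
literal = [entry for entry in root.findall("entry") if entry.text == "&abc;"]

print(texts)
print(len(matches), len(literal))
```

As with the XPath above, the comparison has to be made against a b c, because the &abc; reference no longer exists once the document has been parsed.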