I have an XML file in which I want to address a certain element from Java using XPath. The problem is that the element is in the d: namespace, and none of the ways of adding the namespace to the XPath that I found in related threads have worked. Is the d: namespace a special namespace that follows different rules?
For reference, here is the XML I am trying to work with:
<?xml version="1.0" encoding="utf-8"?>
<feed xml:base="https://company.com/organisation/_api/"
      xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry m:etag="&quot;3&quot;">
    <id>067d7924-2a19-4094-b588-347b0869a19c</id>
    <content type="application/xml">
      <m:properties>
        <d:Modified m:type="Edm.DateTime">2023-10-06T11:02:47Z</d:Modified>
      </m:properties>
    </content>
  </entry>
  <entry m:etag="&quot;6&quot;">
    <id>c0a9aca5-2a1e-41e5-9da8-95fcd46d3109</id>
    <content type="application/xml">
      <m:properties>
        <d:Modified m:type="Edm.DateTime">2023-10-16T06:46:11Z</d:Modified>
      </m:properties>
    </content>
  </entry>
</feed>
Effectively, I first get a list of both entries as XPathNodes via the XPath /feed/entry, then iterate over them and try to get the Modified dates via the XPath //d:Modified. In theory, that should work, but in practice, it always returns an empty string.
I have tried the following for adding the namespace to the XPath, but without any success so far:
Option A (the answer I found on other threads):
XPathFactory xf = XPathFactory.newInstance();
XPath xpath = xf.newXPath();
xpath.setNamespaceContext(new NamespaceContext() {
    @Override
    public String getNamespaceURI(String prefix) {
        if ("d".equals(prefix)) {
            return "http://schemas.microsoft.com/ado/2007/08/dataservices";
        }
        return null; // Return null for other prefixes
    }
    @Override
    public String getPrefix(String namespaceURI) {
        throw new UnsupportedOperationException();
    }
    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        throw new UnsupportedOperationException();
    }
});
Option B (something I tried myself):
XPathFactory xf = XPathFactory.newInstance();
SimpleNamespaceContext namespaceContext = new SimpleNamespaceContext();
namespaceContext.bindNamespaceUri("d", "http://schemas.microsoft.com/ado/2007/08/dataservices");
xPath = xf.newXPath();
xPath.setNamespaceContext(namespaceContext);
Option C (if I do it like this, the code I use to get the entries no longer works and the XPathNodes contains 0 entries):
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
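A likely explanation for the 0 entries in Option C: once the parser is namespace-aware, feed and entry are in the default Atom namespace, and an unprefixed name in XPath 1.0 only matches elements in no namespace. The following is a minimal sketch of that effect, not the original code. The class name DefaultNamespaceDemo, the helper count, the prefix a, and the trimmed-down sample XML are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DefaultNamespaceDemo {
    // Trimmed-down stand-in for the feed: only the default Atom namespace matters here.
    static final String XML =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\">"
      + "<entry><id>1</id></entry><entry><id>2</id></entry></feed>";

    // Counts the nodes an XPath selects in a namespace-aware parse of XML.
    static int count(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                // "a" is an arbitrary prefix chosen for this sketch; it only has to
                // map to the same URI that the document's default namespace uses.
                return "a".equals(prefix)
                    ? "http://www.w3.org/2005/Atom"
                    : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String namespaceURI) {
                return null;
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.emptyIterator();
            }
        });

        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("/feed/entry"));     // unprefixed names: no match
        System.out.println(count("/a:feed/a:entry")); // prefixed names: both entries
    }
}
```

The prefix used in the XPath does not have to match anything in the document; only the namespace URI it is bound to matters.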
I have also tried accessing it via the XPath //*[local-name()='Modified'], but the problem with that is that even if I am already in one specific entry, it still returns the Modified nodes of both entries (which confused me at first, until I realized that all nodes apparently still contain the entire document tree). If I try to access something inside the nodes such as the id (via //id), it works nicely and returns just that one correct node. It just doesn't work with anything in that weird d: namespace, and I have no idea why.
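The behaviour described above is consistent with how XPath defines a leading //: it starts at the root of the document that contains the context node, whereas .// starts at the context node itself. Here is a small sketch of the difference. The class name RelativePathDemo, the helper evaluateOnSecondEntry, and the simplified XML (no default namespace, so /feed/entry still matches) are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RelativePathDemo {
    // Simplified feed: two entries, each with a d:Modified child.
    static final String XML =
        "<feed xmlns:d=\"http://schemas.microsoft.com/ado/2007/08/dataservices\">"
      + "<entry><id>first</id><d:Modified>2023-10-06T11:02:47Z</d:Modified></entry>"
      + "<entry><id>second</id><d:Modified>2023-10-16T06:46:11Z</d:Modified></entry>"
      + "</feed>";

    // Evaluates the expression as a string, using the SECOND entry as context node.
    static String evaluateOnSecondEntry(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList entries = (NodeList) xpath.evaluate("/feed/entry", doc, XPathConstants.NODESET);
        Node second = entries.item(1);
        return xpath.evaluate(expression, second);
    }

    public static void main(String[] args) throws Exception {
        // Leading "//" searches the whole document, so the first entry's date comes back.
        System.out.println(evaluateOnSecondEntry("//*[local-name()='Modified']"));
        // prints 2023-10-06T11:02:47Z

        // Leading ".//" searches only below the context node.
        System.out.println(evaluateOnSecondEntry(".//*[local-name()='Modified']"));
        // prints 2023-10-16T06:46:11Z
    }
}
```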
Can anyone tell me what I am doing wrong here?
EDIT:
To clarify, my aim is to find all entries, iterate over them and get their "Modified" dates so I can work with them. In essence, this is how I imagine it should look, but I tried it and it doesn't work that way, forcing me to use the workaround below that uses the node ID and ignores the namespace.
Node getMostRecentNode() {
    Document document = getDocument();
    XPathNodes entries = evaluteXpath(document, "/feed/entry", XPathNodes.class);
    // XPathNodes is zero-indexed, so start at 0 to include the first entry
    for (int index = 0; index < entries.size(); index++) {
        Node entry = entries.get(index);
        String modifiedString = evaluteXpath(entry, "//d:Modified", String.class);
        [...logic for getting most recent node ...]
    }
}
Is this something that could work if I got the namespaces right? Or do I already have an error in my understanding of how XPaths work at this stage?
For the record, I have now found something that works. I don't think it's ideal, but it does get the job done.
Basically, what I now do is rely on the fact that I can read the ID from a node once I have it, and then use that ID to build a complete XPath with the namespace-ignore-hack.
The whole mess looks somewhat like this:
public String getTargetNodeModified(XPathNodes entries) throws XPathExpressionException {
    Node targetEntry = getTargetNode(entries);
    // "*" selects the first child element of the entry, which is its id
    String targetEntryId = evaluteXpath(targetEntry, "*", String.class);
    String searchString = String.format(
        "//entry[id='%s']//*[local-name()='Modified']",
        targetEntryId
    );
    return evaluteXpath(targetEntry, searchString, String.class);
}

public <T> T evaluteXpath(Object object, String xPathString, Class<T> type) throws XPathExpressionException {
    XPathExpression xPathExpression = xPath.compile(xPathString);
    return xPathExpression.evaluateExpression(object, type);
}
Again, I find it very funky that I need to add the //entry[id='%s'] when I'm basing my search on the targetEntry, but apparently that's how it works.
If anyone can think of a cleaner solution for this mess, please post it here.
EDIT:
Thanks to one of @Michael Kay's comments below, I was able to simplify it to this:
public String getTargetNodeModified(XPathNodes entries) throws XPathExpressionException {
    Node targetEntry = getTargetNode(entries);
    return evaluteXpath(targetEntry, ".//*[local-name()='Modified']", String.class);
}
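For comparison, here is a sketch of how the same lookup might be done without the local-name() hack, assuming a namespace-aware parser and a NamespaceContext that binds both the Atom namespace and the d: namespace. This is not the code from the project: the class name MostRecentEntry, the method mostRecentEntryId, and the prefix a are made up for illustration, and it uses NodeList instead of XPathNodes so that it also runs on older JDKs.

```java
import java.io.StringReader;
import java.time.Instant;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MostRecentEntry {
    // Prefix-to-URI bindings for the XPath side; "a" is an arbitrary choice for Atom.
    static final Map<String, String> NAMESPACES = Map.of(
        "a", "http://www.w3.org/2005/Atom",
        "d", "http://schemas.microsoft.com/ado/2007/08/dataservices");

    // Shortened version of the feed from the question.
    static final String SAMPLE =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\""
      + " xmlns:d=\"http://schemas.microsoft.com/ado/2007/08/dataservices\""
      + " xmlns:m=\"http://schemas.microsoft.com/ado/2007/08/dataservices/metadata\">"
      + "<entry><id>067d7924-2a19-4094-b588-347b0869a19c</id><content><m:properties>"
      + "<d:Modified>2023-10-06T11:02:47Z</d:Modified></m:properties></content></entry>"
      + "<entry><id>c0a9aca5-2a1e-41e5-9da8-95fcd46d3109</id><content><m:properties>"
      + "<d:Modified>2023-10-16T06:46:11Z</d:Modified></m:properties></content></entry>"
      + "</feed>";

    // Returns the id of the entry with the latest d:Modified value, or null if there is none.
    static String mostRecentEntryId(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for prefixed XPath names to match
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return NAMESPACES.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }
            @Override
            public String getPrefix(String namespaceURI) {
                return null;
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.emptyIterator();
            }
        });

        NodeList entries = (NodeList) xpath.evaluate("/a:feed/a:entry", doc, XPathConstants.NODESET);
        Node newest = null;
        Instant newestTime = null;
        for (int i = 0; i < entries.getLength(); i++) {
            Node entry = entries.item(i);
            // ".//" keeps the search inside the current entry.
            Instant modified = Instant.parse(xpath.evaluate(".//d:Modified", entry));
            if (newestTime == null || modified.isAfter(newestTime)) {
                newest = entry;
                newestTime = modified;
            }
        }
        return newest == null ? null : xpath.evaluate("a:id", newest);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(mostRecentEntryId(SAMPLE));
        // prints c0a9aca5-2a1e-41e5-9da8-95fcd46d3109
    }
}
```

Whether this is preferable to the local-name() version is a judgment call: it needs more setup, but it cannot accidentally match a Modified element from some other namespace.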