java xml xpath namespaces

Find XML node in d:-namespace in Java using XPath


I have an XML file and want to address a certain element in it from Java using XPath. The problem is that the element is in the d: namespace, and none of the ways of adding the namespace to the XPath that I found in related threads have worked. Is the d: namespace a special namespace that follows different rules?

For reference, here is the XML I am trying to work with:

<?xml version="1.0" encoding="utf-8"?>
<feed xml:base="https://company.com/organisation/_api/"
    xmlns="http://www.w3.org/2005/Atom"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
    <entry m:etag="&quot;3&quot;">
        <id>067d7924-2a19-4094-b588-347b0869a19c</id>
        <content type="application/xml">
            <m:properties>
                <d:Modified m:type="Edm.DateTime">2023-10-06T11:02:47Z</d:Modified>
            </m:properties>
        </content>
    </entry>
    <entry m:etag="&quot;6&quot;">
        <id>c0a9aca5-2a1e-41e5-9da8-95fcd46d3109</id>
        <content type="application/xml">
            <m:properties>
                <d:Modified m:type="Edm.DateTime">2023-10-16T06:46:11Z</d:Modified>
            </m:properties>
        </content>
    </entry>
</feed>

Effectively, I first get a list of both entries as XPathNodes via the XPath /feed/entry, then iterate over them and try to get the Modified dates via the XPath //d:Modified. In theory that should work, but in practice it always returns an empty string.

I have tried the following for adding the namespace to the XPath, but without any success so far:

Option A (the answer I found on other threads):

        XPathFactory xf = XPathFactory.newInstance();
        XPath xpath = xf.newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if ("d".equals(prefix)) {
                    return "http://schemas.microsoft.com/ado/2007/08/dataservices";
                }
                return null; // Return null for other prefixes
            }

            @Override
            public String getPrefix(String namespaceURI) {
                throw new UnsupportedOperationException();
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                throw new UnsupportedOperationException();
            }
        });

Option B (something I tried myself):

        XPathFactory xf = XPathFactory.newInstance();
        SimpleNamespaceContext namespaceContext = new SimpleNamespaceContext();
        namespaceContext.bindNamespaceUri("d", "http://schemas.microsoft.com/ado/2007/08/dataservices");
        xPath = xf.newXPath();
        xPath.setNamespaceContext(namespaceContext);

Option C (with this, the code I use to get the entries stops working and the XPathNodes contains 0 entries):

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

I have also tried accessing it via the XPath //*[local-name()='Modified']. The problem there is that even when I evaluate it against one specific entry, it still returns the Modified nodes of both entries (which confused me at first, until I realized that every node apparently still carries the entire document tree). If I access something else inside the entry, such as the id (via //id), it works nicely and returns just the one correct node. It just doesn't work with anything in that weird d: namespace, and I have no idea why.

Can anyone tell me what I am doing wrong here?


EDIT:

To clarify, my aim is to find all entries, iterate over them, and get their "Modified" dates so I can work with them. In essence, this is how I imagine it should look, but when I tried it, it didn't work, which forced me into the workaround that uses the node ID and ignores the namespace.

Node getMostRecentNode(){
    Document document = getDocument();
    XPathNodes entries = evaluteXpath(document, "/feed/entry", XPathNodes.class);
    for (int index = 1; index < entries.size(); index++) {
        Node entry = entries.get(index);
        String modifiedString = evaluteXpath(entry, "//d:Modified", String.class);
        [...logic for getting most recent node ...]
    }
}

Is this something that could work if I got the namespaces right? Or do I already have an error in my understanding of how XPaths work at this stage?
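
On the second part of that question, the path semantics can be checked in isolation. As far as XPath 1.0 goes, an expression that starts with // is absolute: it searches from the document root no matter which node it is evaluated against, while .// searches only below the context node. The sketch below is an illustration, not the code from this thread: the class name RelativePathDemo, the trimmed feed, and the helpers firstEntry and count are made up for it, and it parses namespace-aware and matches by local-name() so that no namespace context is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original code.
class RelativePathDemo {

    // Trimmed-down version of the feed from the question (two entries).
    static final String XML =
        "<feed xmlns='http://www.w3.org/2005/Atom'"
        + " xmlns:d='http://schemas.microsoft.com/ado/2007/08/dataservices'>"
        + "<entry><id>first</id><d:Modified>2023-10-06T11:02:47Z</d:Modified></entry>"
        + "<entry><id>second</id><d:Modified>2023-10-16T06:46:11Z</d:Modified></entry>"
        + "</feed>";

    // Parses the sample feed and returns its first <entry> element.
    static Node firstEntry(XPath xpath) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        return (Node) xpath.evaluate(
            "/*[local-name()='feed']/*[local-name()='entry'][1]",
            document, XPathConstants.NODE);
    }

    // Counts the nodes an expression selects when evaluated against a context node.
    static int count(XPath xpath, String expression, Node context) throws Exception {
        NodeList hits = (NodeList) xpath.evaluate(expression, context, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node entry = firstEntry(xpath);
        // Absolute path: starts at the document root, so both entries are searched.
        System.out.println(count(xpath, "//*[local-name()='Modified']", entry));  // prints 2
        // Relative path: only descendants of the context entry are searched.
        System.out.println(count(xpath, ".//*[local-name()='Modified']", entry)); // prints 1
    }
}
```

So the context node is respected; it is the leading // that resets the search to the root of the document the node belongs to.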


Solution

  • For the record, I have now found something that works. I don't think it's ideal, but it does get the job done.

    Basically, what I now do is rely on the fact that I can read the ID from a node once I have it, and then use that ID to build a complete XPath with the namespace-ignore-hack.

    The whole mess looks somewhat like this:

    public String getTargetNodeModified(XPathNodes entries) {
        Node targetEntry = getTargetNode(entries);
        // "*" selects the child elements; as a String this yields the first child, the <id>.
        String targetEntryId = evaluteXpath(targetEntry, "*", String.class);
        String searchString = String.format(
            "//entry[id='%s']//*[local-name()='Modified']",
            targetEntryId
        );
        return evaluteXpath(targetEntry, searchString, String.class);
    }
    
    public <T> T evaluteXpath(Object object, String xPathString, Class<T> type) {
        XPathExpression xPathExpression = xPath.compile(xPathString);
        return xPathExpression.evaluateExpression(object, type);
    }   
    

    Again, I find it very funky that I need to add the //entry[id='%s'] when I'm basing my search on the targetEntry, but apparently that's how it works.

    If anyone can think of a cleaner solution for this mess, please post it here.


    EDIT:

    Thanks to one of @Michael Kay's comments below, I have now been able to simplify it to this:

    public String getTargetNodeModified(XPathNodes entries) {
        Node targetEntry = getTargetNode(entries);
        return evaluteXpath(targetEntry, ".//*[local-name()='Modified']", String.class);
    }
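
A namespace-based variant may also be possible. The likely reason the earlier attempts failed is that they were tried separately: Options A and B register the d prefix but need a namespace-aware parse (Option C) before d:Modified can match anything, and once the parse is namespace-aware, /feed/entry stops matching because feed and entry are in the default Atom namespace, which XPath 1.0 only matches through a bound prefix. The following is a minimal sketch under those assumptions; the class NamespaceAwareLookup, the method modifiedDates, the shortened sample feed, and the prefix a for Atom are hypothetical names, not taken from the code above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathNodes;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; requires Java 9+ (Map.of, XPathNodes, evaluateExpression).
class NamespaceAwareLookup {

    // Prefixes used in the XPath expressions. "a" is an arbitrary choice for Atom.
    static final Map<String, String> PREFIXES = Map.of(
        "a", "http://www.w3.org/2005/Atom",
        "d", "http://schemas.microsoft.com/ado/2007/08/dataservices");

    // Shortened version of the feed from the question.
    static final String XML =
        "<feed xmlns='http://www.w3.org/2005/Atom'"
        + " xmlns:d='http://schemas.microsoft.com/ado/2007/08/dataservices'"
        + " xmlns:m='http://schemas.microsoft.com/ado/2007/08/dataservices/metadata'>"
        + "<entry><content><m:properties>"
        + "<d:Modified>2023-10-06T11:02:47Z</d:Modified>"
        + "</m:properties></content></entry>"
        + "<entry><content><m:properties>"
        + "<d:Modified>2023-10-16T06:46:11Z</d:Modified>"
        + "</m:properties></content></entry>"
        + "</feed>";

    // Builds an XPath whose namespace context resolves the prefixes above.
    static XPath newXPath() {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return PREFIXES.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return null; // reverse lookup is not needed for this sketch
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.emptyIterator();
            }
        });
        return xpath;
    }

    // Returns the Modified value of every entry, in document order.
    static List<String> modifiedDates(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, d:Modified cannot match
        Document document = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        XPath xpath = newXPath();
        XPathNodes entries = xpath.evaluateExpression("/a:feed/a:entry", document, XPathNodes.class);
        List<String> dates = new ArrayList<>();
        for (Node entry : entries) {
            // The leading "." keeps the search inside the current entry.
            dates.add(xpath.evaluateExpression(".//d:Modified", entry, String.class));
        }
        return dates;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(modifiedDates(XML)); // prints [2023-10-06T11:02:47Z, 2023-10-16T06:46:11Z]
    }
}
```

The prefix used in the XPath does not have to be the one used in the document; only the namespace URI it is bound to matters, which is why a made-up prefix like a works for the default Atom namespace.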