I have an XML file that I need to navigate. It looks something like this (full XML is here):
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<md:EntitiesDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">
<md:EntityDescriptor xmlns:saml2="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:saml2p="urn:oasis:names:tc:SAML:2.0:protocol" xmlns:xsi="http://www.w3.org/2001/XMLSchemainstance" ID="_id-83bbfdd3-e4c4-42cf-a024-e4733569a4ae" entityID="https://id.eht.eu">
<md:Organization>
<md:OrganizationName xml:lang="it">EtnaHitech</md:OrganizationName>
<md:OrganizationName xml:lang="en">EtnaHitech</md:OrganizationName>
<md:OrganizationDisplayName xml:lang="it">EHT</md:OrganizationDisplayName>
<md:OrganizationDisplayName xml:lang="en">EHT</md:OrganizationDisplayName>
</md:Organization>
</md:EntityDescriptor>
<md:EntityDescriptor xmlns:saml2="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:xs="http://www.w3.org/2001/XMLSchemainstance" ID="_gh3s48d19e23e85be40k4ab5ey331e7k4f04f73fb5" entityID="https://id.lepida.it/idp/shibboleth">
<md:Organization>
<md:OrganizationName xml:lang="it">Lepida</md:OrganizationName>
<md:OrganizationDisplayName xml:lang="it">Lepida</md:OrganizationDisplayName>
</md:Organization>
</md:EntityDescriptor>
</md:EntitiesDescriptor>
Let's say I want to get the md:EntityDescriptor element with a specific value of the entityID attribute, for example entityID="https://id.eht.eu".
I tried to use XPath with Java, and this is my code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("C:/outputTest/input.xml");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//md:EntityDescriptor[@entityID='https://id.eht.eu']");
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
And when I try to iterate over the result:
if (nl != null && nl.getLength() != 0) {
for (int i = 0; i < nl.getLength(); i++) {
System.out.println(nl.item(i).getNodeValue());
}
}
my NodeList is always empty. I can't get this to work. I expect the specified node to at least show up in my NodeList, and then I'll try to remove it entirely from the document. In general, I need something that finds any md:EntityDescriptor node with a specified entityID value and removes it from the document.
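Your XML uses namespaces. Querying such documents with XPath is slightly different (see "XPath with namespace in Java"). Your expression uses the md: prefix, but nothing in your code tells the XPath engine which namespace URI that prefix stands for. Using the knowledge from the linked question, the simplest way to adapt your code is to change your XPath so that it matches on the local name only: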
//*[local-name()='EntityDescriptor'][@entityID='https://id.eht.eu']
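If you would rather keep the md: prefix in the expression, the other common route is to parse with a namespace-aware factory and register a NamespaceContext on the XPath object. The following is only a minimal sketch of that idea, not a drop-in replacement: the class name NamespaceAwareLookup and the helpers parse and findEntities are made up for illustration, the XML is read from a String instead of your file, and the entityID value is assumed to contain no single quote because it is concatenated into the expression.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper class, named for this sketch only.
class NamespaceAwareLookup {
    // Namespace URI bound to the md prefix in the sample XML.
    static final String MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata";

    // Parses XML text with namespace awareness switched on (it is off by default).
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the md:EntityDescriptor elements whose entityID equals entityId.
    static NodeList findEntities(Document doc, String entityId) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Tell the XPath engine which URI the md prefix stands for.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "md".equals(prefix) ? MD_NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String uri) {
                return MD_NS.equals(uri) ? "md" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                return MD_NS.equals(uri)
                        ? Collections.singletonList("md").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        // Assumption: entityId contains no single quote.
        String expr = "//md:EntityDescriptor[@entityID='" + entityId + "']";
        return (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
    }
}
```

The prefix used in the expression only has to match the one registered in the NamespaceContext; it does not have to be the same prefix the document uses, since matching is done on the namespace URI.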
Next, you said you wanted to remove the elements that match your search query (see "Removing nodes from an XmlDocument"). You can do so by iterating over your nl one Node at a time, getting each node's parent, and having the parent remove that child. Adapting your own code, this process could look like this:
for (int i = 0; i < nl.getLength(); i++) {
Node elem = nl.item(i);
// Debug output:
// System.out.println(elem.getTextContent());
elem.getParentNode().removeChild(elem);
}
Finally, you probably want to store your modified document (see "How to update XML using XPath and Java"). You can do something like this:
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(doc), new StreamResult(new File("C:/outputTest/output.xml")));
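Putting the three steps together (select, remove, serialize), a self-contained sketch could look like the following. It is an illustration under a few assumptions rather than a finished utility: the class EntityDescriptorRemover and its removeEntity method are invented names, the XML is passed in and returned as a String instead of being read from and written to your files, the parser is made namespace-aware, and the entityID value is assumed to contain no single quote.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this sketch only.
class EntityDescriptorRemover {

    // Removes every EntityDescriptor with the given entityID and returns the new XML text.
    static String removeEntity(String xml, String entityId) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Match on the local name so no NamespaceContext is needed.
        // Assumption: entityId contains no single quote.
        String expr = "//*[local-name()='EntityDescriptor'][@entityID='" + entityId + "']";
        NodeList nl = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);

        // Detach each match from its parent.
        for (int i = 0; i < nl.getLength(); i++) {
            Node elem = nl.item(i);
            elem.getParentNode().removeChild(elem);
        }

        // Serialize the modified DOM back to text.
        Transformer xformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        xformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

To write to a file as in the snippet above, replace the StringWriter-based StreamResult with new StreamResult(new File(...)).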