java xml parsing comparison xmlunit

Compare two documents where both parent elements and child elements are ordered differently


I'm trying to unit test some methods that produce XML. I have an expected XML string and the result string, and after googling and searching Stack Overflow I found XMLUnit. However, it doesn't seem to handle one particular case: repeating elements that appear in a different order and whose child elements are also in a different order. For example:

Expected XML:

<graph>
  <parent>
    <foo>David</foo>
    <bar>Rosalyn</bar>
  </parent>
  <parent>
    <bar>Alexander</bar>
    <foo>Linda</foo>
  </parent>
</graph>

Actual XML:

<graph>
  <parent>
    <foo>Linda</foo>
    <bar>Alexander</bar>
  </parent>
  <parent>
    <bar>Rosalyn</bar>
    <foo>David</foo>
  </parent>
</graph>

You can see that the parent node repeats and its contents can be in any order. These two pieces of XML should be equivalent, but none of the Stack Overflow examples I've seen handles this. (Best way to compare 2 XML documents in Java) (How can I compare two similar XML files in XMLUnit)
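
To make "equivalent" concrete, here is a minimal sketch that uses only the JDK DOM API, not XMLUnit. It builds a canonical string for each element by sorting the canonical strings of its child elements, so sibling order stops mattering at every level. The names `OrderInsensitiveXml`, `canonical`, and `equivalent` are made up for this illustration, and the sketch assumes attributes, comments, and CDATA sections can be ignored.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of XMLUnit.
final class OrderInsensitiveXml {

    /** Builds a string for an element in which sibling order no longer matters. */
    static String canonical(Element element) {
        List<String> children = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        NodeList nodes = element.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                children.add(canonical((Element) node));
            } else if (node.getNodeType() == Node.TEXT_NODE) {
                // trimming drops the indentation between elements
                text.append(node.getNodeValue().trim());
            }
        }
        // sorting the children removes the dependence on sibling order
        Collections.sort(children);
        return "<" + element.getNodeName() + ">" + text + String.join("", children)
                + "</" + element.getNodeName() + ">";
    }

    /** Returns true when both documents have the same canonical form. */
    static boolean equivalent(String expectedXml, String actualXml) {
        return canonical(parse(expectedXml)).equals(canonical(parse(actualXml)));
    }

    private static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String expected = "<graph>"
                + "<parent><foo>David</foo><bar>Rosalyn</bar></parent>"
                + "<parent><bar>Alexander</bar><foo>Linda</foo></parent>"
                + "</graph>";
        String actual = "<graph>"
                + "<parent><foo>Linda</foo><bar>Alexander</bar></parent>"
                + "<parent><bar>Rosalyn</bar><foo>David</foo></parent>"
                + "</graph>";
        System.out.println(equivalent(expected, actual)); // prints "true"
    }
}
```

Under this definition the two documents above are equivalent, while swapping a child between the two parents would not be, because each parent's canonical string still contains its own children.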

I've resorted to creating Documents from the XML strings, stepping through each expected parent node, and comparing it to each actual parent node to see if one of them is equivalent.

This seems like a lot of reinventing the wheel for something that should be a relatively common comparison. XMLUnit seems to do a lot, so perhaps I've missed something, but from what I can tell it falls short in this particular case.

Is there an easier/better way to do this?

My Solution:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setCoalescing(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
DocumentBuilder db = dbf.newDocumentBuilder();
// parse and normalize expected xml
Document expectedXMLDoc = db.parse(new ByteArrayInputStream(resultXML.getBytes()));
expectedXMLDoc.normalizeDocument();
// parse and normalize actual xml
Document actualXMLDoc = db.parse(new ByteArrayInputStream(actual.getXml().getBytes()));
actualXMLDoc.normalizeDocument();
// expected and actual parent nodes
NodeList expectedParentNodes = expectedXMLDoc.getLastChild().getChildNodes();
NodeList actualParentNodes = actualXMLDoc.getLastChild().getChildNodes();

// assert same amount of nodes in actual and expected
assertEquals("actual XML does not have expected amount of Parent nodes", expectedParentNodes.getLength(), actualParentNodes.getLength());

// loop through expected parent nodes
for(int i=0; i < expectedParentNodes.getLength(); i++) {
    Node expectedParentNode = expectedParentNodes.item(i);
    // skip whitespace text nodes; only an element can become a document's root
    if(expectedParentNode.getNodeType() != Node.ELEMENT_NODE) {
        continue;
    }
    // create doc from node
    Document expectedParentDoc = db.newDocument();
    Node importedExpectedNode = expectedParentDoc.importNode(expectedParentNode, true);
    expectedParentDoc.appendChild(importedExpectedNode);

    boolean hasSimilar = false;
    StringBuilder  messages = new StringBuilder();

    // for each expected parent, find a similar parent
    for(int j=0; j < actualParentNodes.getLength(); j++) {
        Node actualParentNode = actualParentNodes.item(j);
        // skip whitespace text nodes here as well
        if(actualParentNode.getNodeType() != Node.ELEMENT_NODE) {
            continue;
        }
        // create doc from node
        Document actualParentDoc = db.newDocument();
        Node importedActualNode = actualParentDoc.importNode(actualParentNode, true);
        actualParentDoc.appendChild(importedActualNode);

        // XMLUnit Diff
        Diff diff = new Diff(expectedParentDoc, actualParentDoc);
        messages.append(diff.toString());
        boolean similar = diff.similar();
        if(similar) {
            hasSimilar = true;
        }
    }
    // assert it found a similar parent node
    assertTrue("expected and actual XML nodes are not equivalent " + messages, hasSimilar);        
}    
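
One caveat about the parser setup above: according to the JDK documentation, `setIgnoringElementContentWhitespace(true)` only takes effect when the parser is validating, so with indented XML and no DTD `getChildNodes()` still returns whitespace-only text nodes between the elements. A minimal sketch of one way to remove them before comparing follows; `WhitespaceStripper` and its methods are hypothetical names for this illustration, and it assumes whitespace-only text is never significant in the documents under test.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of XMLUnit.
final class WhitespaceStripper {

    /** Recursively removes text nodes that contain only whitespace. */
    static void stripWhitespaceTextNodes(Node node) {
        NodeList children = node.getChildNodes();
        // iterate backwards because the NodeList is live and shrinks on removal
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespaceTextNodes(child);
            }
        }
    }

    /** Parses with a default (non-validating) parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<graph>\n  <parent>\n    <foo>David</foo>\n  </parent>\n</graph>");
        Node graph = doc.getDocumentElement();
        // whitespace text nodes are counted alongside <parent>
        System.out.println(graph.getChildNodes().getLength());
        stripWhitespaceTextNodes(doc);
        // only the <parent> element remains
        System.out.println(graph.getChildNodes().getLength());
    }
}
```

Calling such a helper on both documents right after parsing would keep the node-count assertion and the loops from ever seeing the indentation.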

Solution

  • Just realized I hadn't selected an answer for this. I ended up using something very similar to my solution. Here's the final version that worked for me. I've wrapped it in a class to use with JUnit so the methods can be used like any other JUnit assertion.

    If no children need to be in a particular order, as in my case, you can run

    assertEquivalentXml(expectedXML, testXML, null, null);
    

    If the children of some nodes are expected to be in order and/or some attribute values need to be ignored:

    assertEquivalentXml(expectedXML, testXML,
                    new String[]{"dataset", "categories"}, new String[]{"color", "anchorBorderColor", "anchorBgColor"});
    

    Here's the class:

    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.Set;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;

    import junit.framework.Assert;

    import org.custommonkey.xmlunit.Diff;
    import org.custommonkey.xmlunit.Difference;
    import org.custommonkey.xmlunit.DifferenceConstants;
    import org.custommonkey.xmlunit.DifferenceListener;
    import org.custommonkey.xmlunit.ElementNameAndAttributeQualifier;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.SAXException;

    /**
     * A set of methods that assert XML equivalence specifically for XmlProvider classes. Extends 
     * <code>junit.framework.Assert</code>, meaning that these methods are recognised as assertions by junit.
     *
     * @author munick
     */
    public class XmlProviderAssertions extends Assert {    
    
        /**
         * Asserts two xml strings are equivalent. Nodes are not expected to be in order. Order can be compared among the 
         * children of the top parent node by adding their names to nodesWithOrderedChildren 
         * (e.g. in <graph><dataset><set value="1"/><set value="2"/></dataset></graph> the top parent node is graph 
         * and we can expect the children of dataset to be in order by adding "dataset" to nodesWithOrderedChildren).
         * 
         * All attribute names and values are compared unless their name is in attributesToIgnore in which case only the 
         * name is compared and any difference in value is ignored.
         * 
         * @param expectedXML the expected xml string 
         * @param testXML the xml string being tested
         * @param nodesWithOrderedChildren names of nodes whose children should be in order
         * @param attributesToIgnore names of attributes whose values should be ignored
         */
        public static void assertEquivalentXml(String expectedXML, String testXML, String[] nodesWithOrderedChildren, String[] attributesToIgnore) {
            Set<String> setOfNodesWithOrderedChildren = new HashSet<String>();
            if(nodesWithOrderedChildren != null ) {
                Collections.addAll(setOfNodesWithOrderedChildren, nodesWithOrderedChildren);
            }
    
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setCoalescing(true);
            dbf.setIgnoringElementContentWhitespace(true);
            dbf.setIgnoringComments(true);
            DocumentBuilder db = null;
            try {
                db = dbf.newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                fail("Error testing XML");
            }
    
            Document expectedXMLDoc = null;
            Document testXMLDoc = null;
            try {
                expectedXMLDoc = db.parse(new ByteArrayInputStream(expectedXML.getBytes()));
                expectedXMLDoc.normalizeDocument();
    
                testXMLDoc = db.parse(new ByteArrayInputStream(testXML.getBytes()));
                testXMLDoc.normalizeDocument();
            } catch (SAXException e) {
                fail("Could not parse expected or test XML: " + e.getMessage());
            } catch (IOException e) {
                fail("Could not read expected or test XML: " + e.getMessage());
            }
            NodeList expectedChildNodes = expectedXMLDoc.getLastChild().getChildNodes();
            NodeList testChildNodes = testXMLDoc.getLastChild().getChildNodes();
    
            assertEquals("Test XML does not have expected amount of child nodes", expectedChildNodes.getLength(), testChildNodes.getLength());
    
            //compare parent nodes        
            Document expectedDEDoc = getNodeAsDocument(expectedXMLDoc.getDocumentElement(), db, false);        
            Document testDEDoc = getNodeAsDocument(testXMLDoc.getDocumentElement(), db, false);
            Diff diff = new Diff(expectedDEDoc, testDEDoc);
            assertTrue("Test XML parent node doesn't match expected XML parent node. " + diff.toString(), diff.similar());
    
            // compare child nodes
            for(int i=0; i < expectedChildNodes.getLength(); i++) {
                // expected child node
                Node expectedChildNode = expectedChildNodes.item(i);
                // skip text nodes
                if( expectedChildNode.getNodeType() == Node.TEXT_NODE ) {
                    continue;
                }
                // convert to document to use in Diff
                Document expectedChildDoc = getNodeAsDocument(expectedChildNode, db, true);
    
                boolean hasSimilar = false;
                StringBuilder  messages = new StringBuilder();
    
                for(int j=0; j < testChildNodes.getLength(); j++) {
                    // find child node in test xml
                    Node testChildNode = testChildNodes.item(j);
                    // skip text nodes
                    if( testChildNode.getNodeType() == Node.TEXT_NODE ) {
                        continue;
                    }
                    // create doc from node
                    Document testChildDoc = getNodeAsDocument(testChildNode, db, true);
    
                    diff = new Diff(expectedChildDoc, testChildDoc);
                    // if it doesn't contain order specific nodes, then use the elem and attribute qualifier, otherwise use the default
                    if( !setOfNodesWithOrderedChildren.contains( expectedChildDoc.getDocumentElement().getNodeName() ) ) {
                        diff.overrideElementQualifier(new ElementNameAndAttributeQualifier());
                    }
                    if(attributesToIgnore != null) {
                        diff.overrideDifferenceListener(new IgnoreNamedAttributesDifferenceListener(attributesToIgnore));
                    }
                    messages.append(diff.toString());
                    boolean similar = diff.similar();
                    if(similar) {
                        hasSimilar = true;
                    }
                }
                assertTrue("Test XML does not match expected XML. " + messages, hasSimilar);
            }
        }
    
        private static Document getNodeAsDocument(Node node, DocumentBuilder db, boolean deep) {
            // create doc from node
            Document nodeDoc = db.newDocument();
            Node importedNode = nodeDoc.importNode(node, deep);
            nodeDoc.appendChild(importedNode);
            return nodeDoc;
        }
    
    }
    
    /**
     * Custom difference listener that ignores differences in attribute values for specified attribute names. Used to 
     * ignore color attribute differences in FusionChartXml equivalence.
     */
    class IgnoreNamedAttributesDifferenceListener implements DifferenceListener {
        Set<String> attributeBlackList;
    
        public IgnoreNamedAttributesDifferenceListener(String[] attributeNames) {        
            attributeBlackList = new HashSet<String>();
            Collections.addAll(attributeBlackList, attributeNames);
        }
    
        public int differenceFound(Difference difference) {
            int differenceId = difference.getId();
            if (differenceId == DifferenceConstants.ATTR_VALUE_ID) {
                if(attributeBlackList.contains(difference.getControlNodeDetail().getNode().getNodeName())) {
                    return DifferenceListener.RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL;
                }
            }
    
            return DifferenceListener.RETURN_ACCEPT_DIFFERENCE;
        }
    
        public void skippedComparison(Node node, Node node1) {
            // left empty
        }
    }
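
The rule the listener applies (attribute names must match, but value differences are ignored for the listed names) can be sketched without XMLUnit as a plain DOM comparison. This is only an illustration of the rule under that reading of the Javadoc above, not how XMLUnit implements it; `AttributeComparison`, `attributesMatch`, and `parse` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper for illustration; not part of XMLUnit.
final class AttributeComparison {

    /** Compares attribute names always, and values unless the name is in ignoredValues. */
    static boolean attributesMatch(Element expected, Element actual, Set<String> ignoredValues) {
        NamedNodeMap expectedAttrs = expected.getAttributes();
        NamedNodeMap actualAttrs = actual.getAttributes();
        if (expectedAttrs.getLength() != actualAttrs.getLength()) {
            return false;
        }
        for (int i = 0; i < expectedAttrs.getLength(); i++) {
            Node expectedAttr = expectedAttrs.item(i);
            Node actualAttr = actualAttrs.getNamedItem(expectedAttr.getNodeName());
            if (actualAttr == null) {
                return false; // the name must be present even when its value is ignored
            }
            boolean ignoreValue = ignoredValues.contains(expectedAttr.getNodeName());
            if (!ignoreValue && !expectedAttr.getNodeValue().equals(actualAttr.getNodeValue())) {
                return false;
            }
        }
        return true;
    }

    /** Parses a single-element XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element expected = parse("<set value=\"1\" color=\"red\"/>");
        Element actual = parse("<set value=\"1\" color=\"blue\"/>");
        Set<String> ignored = new HashSet<>(Arrays.asList("color"));
        System.out.println(attributesMatch(expected, actual, ignored));         // prints "true"
        System.out.println(attributesMatch(expected, actual, new HashSet<>())); // prints "false"
    }
}
```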