
Java: Compare XPath structure


I am writing some functional tests that should compare the structure of two XML documents. This means that tag order and naming are important, while tag content is irrelevant.

For example, the following calls:

Call 1:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
  </book>
</bookstore>

Call 2:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
  </book>
</bookstore>

Calls 1 and 2 have the same tag structure, but:

Call 3:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
  </book>
</bookstore>

Call 3 is different, because it has a <year> tag after <author>, which calls 1 and 2 lack.

What's the Java way to compare XML structures?


Solution

  • I've written a class that flattens the XML DOM tree and produces a human-readable String for comparison.

    To compare two documents, I create an XPathFlattener for each one and compare their toString() results.

    import java.util.ArrayList;
    import java.util.List;
    
    import org.apache.commons.lang.StringUtils;
    import org.w3c.dom.Node;
    
    public class XPathFlattener {
    
        private Node root;
    
        public XPathFlattener(Node root) {
            this.root = root;
        }
    
        /**
         * Flattens the DOM tree rooted at the given node into a list of nodes,
         * in pre-order traversal.
         */
        public List<Node> flatten() {
            return flattenTreeToList(this.root, new ArrayList<Node>());
        }
    
        /**
         * Returns a comma-separated String with one entry per node, giving the
         * node's name and type but not its contents.
         * The entries appear in pre-order.
         */
        @Override
        public String toString() {
            List<String> nodesStrings = new ArrayList<>();
            for (Node n : this.flatten()) {
                nodesStrings.add(stringRepresentation(n));
            }
            return StringUtils.join(nodesStrings, ", ");
        }
    
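        /**
         * Recursively flattens a Node tree to a list, in pre-order traversal.
         * @param node  the root of the (sub)tree to flatten
         * @param nodes the list that visited nodes are appended to
         * @return the same list that was passed in, for convenience
         */
        private static List<Node> flattenTreeToList(Node node, List<Node> nodes) {
            // Visit the node itself first, then each child subtree in document order.
            nodes.add(node);
            for (int i = 0; i < node.getChildNodes().getLength(); i++) {
                Node childNode = node.getChildNodes().item(i);
                flattenTreeToList(childNode, nodes);
            }
            return nodes;
        }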
    
        /**
         * A String representation of the node structure, without its contents.
         * @param node the node to describe
         * @return the node's name and numeric DOM node type, e.g. "[book, (type 1)]"
         */
        private static String stringRepresentation(Node node) {
            return String.format("[%s, (type %d)]", node.getNodeName(), node.getNodeType());
        }
    
    }
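
For completeness, here is a rough sketch of how the comparison itself might be wired up with the JDK's built-in DOM parser. It is not the class above. It repeats the same pre-order idea in a self-contained form, using `String.join` in place of Commons Lang. The names `StructureCompareSketch`, `signature` and `sameStructure` are made up for illustration, and the sample XML is the question's three calls written on one line each.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, named here only for illustration.
class StructureCompareSketch {

    /** Parses an XML string into a DOM Document with the JDK's default parser. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Pre-order walk that records each node's name and type, never its content. */
    static void collect(Node node, List<String> out) {
        out.add(String.format("[%s, (type %d)]", node.getNodeName(), node.getNodeType()));
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out);
        }
    }

    /** Flattened, content-free description of the whole document. */
    static String signature(String xml) {
        List<String> parts = new ArrayList<>();
        collect(parse(xml), parts);
        return String.join(", ", parts);
    }

    /** Two documents have the same structure if their signatures are equal. */
    static boolean sameStructure(String xmlA, String xmlB) {
        return signature(xmlA).equals(signature(xmlB));
    }

    public static void main(String[] args) {
        String call1 = "<bookstore><book category=\"COOKING\"><title lang=\"en\">Everyday Italian</title>"
                + "<author>Giada De Laurentiis</author></book></bookstore>";
        String call2 = "<bookstore><book category=\"CHILDREN\"><title lang=\"en\">Harry Potter</title>"
                + "<author>J K. Rowling</author></book></bookstore>";
        String call3 = "<bookstore><book category=\"WEB\"><title lang=\"en\">Learning XML</title>"
                + "<author>Erik T. Ray</author><year>2003</year></book></bookstore>";

        System.out.println(sameStructure(call1, call2)); // prints "true"
        System.out.println(sameStructure(call1, call3)); // prints "false"
    }
}
```

Two things to keep in mind with this approach: text nodes are part of the flattened list, so whitespace-only text between tags (indentation, line breaks) counts toward the structure, and attributes are not visited at all because they are not child nodes in the DOM.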