Tags: java, arrays, xpath, xpath-2.0, xpath-1.0

Returning Null values for missing tags with EvaluateXpath in Java


I am retrieving XML tag values by evaluating XPath expressions. Assume I have 3 entries with book info, where each entry consists of name, year, and author. In the second entry the author tag is empty, and in the third entry the author tag is missing entirely. I want an array that contains the one author name plus a null value for each of the two entries where no author is specified, like below. I would really appreciate any guidance, hints, or help. :-)

Author: [John Smith,null,null]

My XML File:

<?xml version="1.0" encoding="UTF-8"?>
<perldata>
    <item key="book">
        <item key="name">My Book Name</item>
        <item key="year">2019</item>
        <item key="author">John Smith</item>
    </item>
    <item>
        <item key="name">Anonymous Book Name 1</item>
        <item key="year">2018</item>
        <item key="author"></item>
    </item>
    <item>
        <item key="name">Her Book Name</item>
        <item key="year">2018</item>
    </item>
</perldata>

As shown above, the third entry does not contain an author tag at all. I don't know how to produce a null value for it with evaluateXPath, and I really need help. Here is my code:

        import java.io.BufferedWriter;
        import java.io.FileWriter;
        import java.io.IOException;
        import java.io.PrintWriter;
        import java.util.ArrayList;
        import java.util.List;
        import javax.xml.parsers.DocumentBuilder;
        import javax.xml.parsers.DocumentBuilderFactory;
        import javax.xml.xpath.XPath;
        import javax.xml.xpath.XPathConstants;
        import javax.xml.xpath.XPathExpression;
        import javax.xml.xpath.XPathExpressionException;
        import javax.xml.xpath.XPathFactory;
        import org.w3c.dom.Document;
        import org.w3c.dom.NodeList;

        // The class name is a placeholder; the original snippet did not show it
        public class BookReader {

        public static void main(String[] args) throws Exception
        {
            String fileName = "book.xml";
            Document document = getDocument(fileName);

            // Write the extracted values to a text file; the writer is closed automatically
            try (PrintWriter pw = new PrintWriter(new BufferedWriter(new FileWriter("/root/Desktop/book.txt")))) {
                // The key value must be quoted inside the predicate: [@key='book']
                pw.println("BookName: " + evaluateXPath(document, "/perldata/item[@key='book']/item[@key='name']/text()"));
                pw.println("year: " + evaluateXPath(document, "/perldata/item[@key='book']/item[@key='year']/text()"));
                pw.println("Author: " + evaluateXPath(document, "/perldata/item[@key='book']/item[@key='author']/text()"));
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        private static List<String> evaluateXPath(Document document, String xpathExpression) throws Exception 
        {
            // Create XPathFactory object
            XPathFactory xpathFactory = XPathFactory.newInstance();

            // Create XPath object
            XPath xpath = xpathFactory.newXPath();

            List<String> values = new ArrayList<>();
            try
            {
                // Create XPathExpression object
                XPathExpression expr = xpath.compile(xpathExpression);

                // Evaluate expression result on XML document
                NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);

                for (int i = 0; i < nodes.getLength(); i++) {
                    values.add(nodes.item(i).getNodeValue());
                }

            } catch (XPathExpressionException e) {
                e.printStackTrace();
            }

            return values;
        }


        private static Document getDocument(String fileName) throws Exception 
        {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(fileName);
            return doc;
        }

        }

Solution

  • Summary:

    Trim the tag's text content, and check if the resulting string is empty or not.
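
    As a minimal sketch of that rule, the check could be wrapped in a small helper. The class and method names here (AuthorText, blankToNull, textOrNull) are made up for illustration and are not part of any library:

```java
import org.w3c.dom.Node;

class AuthorText {
    // Hypothetical helper: trims a raw string and maps null or blank input to null.
    static String blankToNull(String raw) {
        if (raw == null) {
            return null;
        }
        String trimmed = raw.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    // Hypothetical helper: applies the same rule to a DOM node.
    // A missing node (null) also yields null.
    static String textOrNull(Node node) {
        return node == null ? null : blankToNull(node.getTextContent());
    }
}
```

    With this, both an empty tag and a missing tag end up as null, while a tag with real text returns that text with surrounding whitespace removed.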

    Details:

    The XML in the question only has one tag containing key="book". I assume all 3 sections should have this, so we know each one represents a book.

    Therefore, I will assume you have an XML file like the following, which includes one empty "author" tag, and one completely missing "author" tag:

    <?xml version="1.0" encoding="UTF-8"?>
    <perldata>
        <item key="book">
            <item key="name">My Book Name</item>
            <item key="year">2019</item>
            <item key="author">John Smith</item>
        </item>
        <item key="book">
            <item key="name">Anonymous Book Name 1</item>
            <item key="year">2018</item>
            <item key="author"></item>
        </item>
        <item key="book">
            <item key="name">Her Book Name</item>
            <item key="year">2018</item>
        </item>
        <item key="book">
            <item key="name">Another Book Name</item>
            <item key="year">2019</item>
            <item key="author">Jane Jones</item>
        </item>
    </perldata>
    

    Assuming the above, you can print out all author names (with null for each missing or empty one) as follows:

    File file = new File("C:/tmp/Book2.xml");
    FileInputStream fis = new FileInputStream(file);
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = builderFactory.newDocumentBuilder();
    Document xmlDocument = builder.parse(fis);
    XPath xPath = XPathFactory.newInstance().newXPath();
    NodeList bookNodes = (NodeList) xPath.compile("//item[@key='book']")
            .evaluate(xmlDocument, XPathConstants.NODESET);
    
    List<String> authors = new ArrayList<>();
    
    for (int i = 0; i < bookNodes.getLength(); i++) {
        Node bookNode = bookNodes.item(i);
        Node authorNode = (Node) xPath.compile("./item[@key='author']")
                .evaluate(bookNode, XPathConstants.NODE);
    
        if (authorNode == null) {
            authors.add(null);
        } else {
            String s = authorNode.getTextContent().trim();
            authors.add(s.isEmpty() ? null : s);
        }
    }
    System.out.println(authors);
    

    The final print statement gives this:

    [John Smith, null, null, Jane Jones]
    

    Additional notes:

    This loops through all the <item key="book"> sections in the file. For each section, it then performs this targeted search, but only within that section:

    Node authorNode = (Node) xPath.compile("./item[@key='author']")
            .evaluate(bookNode, XPathConstants.NODE);
    

    The evaluate call uses the current bookNode as its context node, so the relative path ./item[@key='author'] only matches direct children of that book.
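
    To illustrate why the leading dot matters, here is a small sketch comparing a relative and an absolute expression evaluated against the same book node. The class name, method name, and inline XML are assumptions made for this demo. An expression starting with // is resolved from the document root even when a book node is passed in, so it can pick up authors from other books:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ContextNodeDemo {
    // Demo data (assumed): the first book has an author, the second has none.
    static final String XML = "<perldata>"
            + "<item key='book'><item key='author'>John Smith</item></item>"
            + "<item key='book'><item key='name'>Her Book Name</item></item>"
            + "</perldata>";

    // Hypothetical helper: counts the nodes that the expression selects
    // when the second book is used as the context node.
    static int countFromSecondBook(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList books = (NodeList) xPath.evaluate("/perldata/item", doc, XPathConstants.NODESET);
            Node secondBook = books.item(1);
            NodeList hits = (NodeList) xPath.evaluate(expression, secondBook, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Relative path: only looks inside the second book
        System.out.println(countFromSecondBook("./item[@key='author']"));
        // Absolute path: searches the whole document, ignoring the context node
        System.out.println(countFromSecondBook("//item[@key='author']"));
    }
}
```

    In this sketch the relative expression finds no author for the second book, while the absolute one finds the author that belongs to the first book, which is why the per-book search needs the ./ prefix.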

    After that, we can check all the possible outcomes:

    • we found a key="author" tag - and it contains an author's name.
    • we found a key="author" tag - but there is no name in it.
    • there is no key="author" tag for this book node.
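
    The three outcomes above can be exercised with a small self-contained sketch. It parses an inline XML string instead of a file, and the class name AuthorOutcomes and the method authorsOf are hypothetical names chosen for this example:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class AuthorOutcomes {
    // Hypothetical helper: returns one entry per book node,
    // with null where the author tag is empty or missing.
    static List<String> authorsOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList books = (NodeList) xPath.evaluate("//item[@key='book']", doc, XPathConstants.NODESET);

            List<String> authors = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                // Search only within the current book
                Node author = (Node) xPath.evaluate("./item[@key='author']", books.item(i), XPathConstants.NODE);
                // Missing tag and empty tag are both treated as "no author"
                String text = (author == null) ? "" : author.getTextContent().trim();
                authors.add(text.isEmpty() ? null : text);
            }
            return authors;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<perldata>"
                + "<item key='book'><item key='author'>John Smith</item></item>"   // name present
                + "<item key='book'><item key='author'>  </item></item>"           // tag present, no name
                + "<item key='book'><item key='name'>Her Book Name</item></item>"  // tag missing
                + "</perldata>";
        System.out.println(authorsOf(xml));
    }
}
```

    Running main prints [John Smith, null, null], one entry per book in document order.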