
How to use XPath in XMLReader API of libxml2?


The tutorial here says we can use XPath with the XMLReader API if we expand the current node and set it as the context node of the xmlXPathContext object. Unfortunately, the example the tutorial provides is in Python, a language I don't know at all. I tried to write my own example in C++, but got stuck: the function xmlXPathSetContextNode always fails. Below are my code and the XML document that the example application reads.

// BUILD: g++ thisFile.cpp -std=c++11 -Wall $(xml2-config --cflags --libs)
#include <cstdio> // for functions printf and fprintf
#include <libxml/xmlreader.h> // for data type xmlTextReader
#include <libxml/xpath.h> // for data type xmlXPathContext
#include <memory> // for class template shared_ptr
#include <stdexcept> // for class runtime_error;
#define _X(s) ((const xmlChar *)s)
int main(int argc, char *argv[])
{
    using std::shared_ptr;
    // Create a text reader.
    shared_ptr<xmlTextReader> reader(::xmlReaderForFile("sample.xml", NULL, 0), &::xmlFreeTextReader);
    // Create a XPath context.
    xmlDocPtr doc = ::xmlTextReaderCurrentDoc(reader.get());
    shared_ptr<xmlXPathContext> ctxt(::xmlXPathNewContext(doc), &::xmlXPathFreeContext);
    // Use the text reader to read the stream.
    int ret;
    try {
        while ((ret = ::xmlTextReaderRead(reader.get())) == 1) {
            // Ignore all nodes except <storyinfo>.
#if 0
            xmlNodePtr node = ::xmlTextReaderCurrentNode(reader.get());
#else
            xmlNodePtr node = ::xmlTextReaderExpand(reader.get());
#endif
            if (::xmlStrncmp(node->name, _X("storyinfo"), 10) != 0) continue;
            // Set the current node as the context node.
            ::printf("node: %p\n", static_cast<void *>(node));
            if (::xmlXPathSetContextNode(node, ctxt.get()) == -1) {
                ::fprintf(stderr, "ERROR(%d): %s\n", ctxt->lastError.code, ctxt->lastError.message);
                throw std::runtime_error("err_xpath_set_context");
            }
            // Use a XPath to find <datewritten>.
            shared_ptr<xmlXPathObject> xpathFound(::xmlXPathEvalExpression(_X("datewritten"), ctxt.get()), &::xmlXPathFreeObject);
            if (xmlXPathNodeSetGetLength(xpathFound->nodesetval) == 0) throw std::runtime_error("err_xpath_not_found");
            shared_ptr<xmlChar> zTextContent(::xmlXPathCastToString(xpathFound.get()), ::xmlFree);
            ::printf("found: %s\n", zTextContent.get());
            break;
        }
        if (ret == -1) {
            ::fprintf(stderr, "ERROR: %s\n", "xmlTextReaderRead failure!");
            return 1;
        }
    }
    catch (const std::runtime_error& e) {
        ::fprintf(stderr, "ERROR: %s\n", e.what());
    }
    // Exit the program.
    return 0;
}

The contents of the XML document are:

<?xml version="1.0"?>
<story>
    <storyinfo>
        <author>John Fleck</author>
        <datewritten>June 2, 2002</datewritten>
        <keyword>example keyword</keyword>
    </storyinfo>
    <body>
        <headline>This is the headline</headline>
        <para>This is the body text.</para>
    </body>
</story>

Any hint will be appreciated. Thanks in advance. m(_ _)m


Solution

  • Problem solved. The function xmlTextReaderCurrentDoc must not be called before the first call to xmlTextReaderRead. I should have checked its return value: in the buggy code above it returns NULL, so the XPath context is never bound to the document being read and xmlXPathSetContextNode fails. The official documentation doesn't say when it is safe to call xmlTextReaderCurrentDoc, so I'll leave this Q&A here for other people who run into the same problem.