How do I use XPath in Java to retrieve data from XML that has a default namespace?


I've come across a problem that I've looked up on Stack Overflow, but none of the solutions there seem to solve it for me.

I'm retrieving XML data from Yahoo and it comes back as below (truncated for brevity's sake).

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<fantasy_content xmlns="http://fantasysports.yahooapis.com/fantasy/v2/base.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" copyright="Data provided by Yahoo! and STATS, LLC" refresh_rate="31" time="55.814027786255ms" xml:lang="en-US" yahoo:uri="http://fantasysports.yahooapis.com/fantasy/v2/league/328.l.108462/settings">
    <league>
        <league_key>328.l.108462</league_key>
        <league_id>108462</league_id>
        <draft_status>postdraft</draft_status>
    </league>
</fantasy_content>

I've been having trouble getting XPath to retrieve any elements, so I've written a unit test to try to isolate the problem. It looks like this:

    final File file = new File("league-settings.xml");
    javax.xml.parsers.DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    dbFactory.setNamespaceAware(true);
    javax.xml.parsers.DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    org.w3c.dom.Document doc = dBuilder.parse(file);
    javax.xml.xpath.XPath xPath = XPathFactory.newInstance().newXPath();
    xPath.setNamespaceContext(new YahooNamespaceContext());
    final String expression = "yfs:league";
    final XPathExpression expr = xPath.compile(expression);
    Object nodes = expr.evaluate(doc, XPathConstants.NODESET);

    assert(nodes instanceof NodeList);
    NodeList leagueNodes = (NodeList)nodes;
    int leaguesLength = leagueNodes.getLength();
    assertEquals(leaguesLength, 1);

The YahooNamespaceContext class I created to map the namespaces looks as follows:

public class YahooNamespaceContext implements NamespaceContext {
    public static final String YAHOO_NS = "http://www.yahooapis.com/v1/base.rng";
    public static final String DEFAULT_NS = "http://fantasysports.yahooapis.com/fantasy/v2/base.rng";
    public static final String YAHOO_PREFIX = "yahoo";
    public static final String DEFAULT_PREFIX = "yfs";

    private final Map<String, String> namespaceMap = new HashMap<String, String>();
    public YahooNamespaceContext() {
        namespaceMap.put(DEFAULT_PREFIX, DEFAULT_NS);
        namespaceMap.put(YAHOO_PREFIX, YAHOO_NS);
    }

    public String getNamespaceURI(String prefix) {
        return namespaceMap.get(prefix);
    }

    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }

    public Iterator<String> getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }

}

Any help from people with more experience with XML namespaces, or debugging tips for XPath compilation/evaluation, would be appreciated.


Solution

  • If the problem is that the resulting node list has a length of zero, try changing

    final String expression = "yfs:league";

    to

    final String expression = "//yfs:league";

    The context for evaluating your XPath expression, doc, is the root node of the document. dBuilder.parse(file) returns the Document, which XPath treats as the root node, not the outermost element (a.k.a. the document element). Remember, in XPath the root node is not an element. So doc is not the yfs:fantasy_content element node but its (invisible) parent.

    In that context, the XPath expression "yfs:league" selects only yfs:league elements that are direct children of the root node. There are none: the root node's only element child is yfs:fantasy_content.
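To see the effect of the context node in isolation, here is a minimal sketch rather than a drop-in fix. It parses a trimmed-down copy of the XML from the question and counts the matches for a few expressions. The class name XPathContextDemo, the count helper and the inline sample string are made up for illustration. The "yfs" prefix and the namespace URI are the ones from the question.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathContextDemo {
    static final String DEFAULT_NS = "http://fantasysports.yahooapis.com/fantasy/v2/base.rng";

    // Trimmed-down stand-in for the Yahoo response shown in the question.
    static final String SAMPLE_XML =
        "<fantasy_content xmlns=\"" + DEFAULT_NS + "\">"
        + "<league><league_key>328.l.108462</league_key></league>"
        + "</fantasy_content>";

    /**
     * Counts the nodes selected by the expression. The context node is the
     * Document (XPath root node) or, if requested, the document element.
     */
    static int count(String expression, boolean fromDocumentElement) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this, namespaces are ignored
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE_XML)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            // Map the (arbitrary) prefix "yfs" to the document's default namespace.
            xPath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "yfs".equals(prefix) ? DEFAULT_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            Node context = fromDocumentElement ? doc.getDocumentElement() : doc;
            NodeList nodes = (NodeList) xPath.evaluate(expression, context, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Relative to the root node: only yfs:fantasy_content is a child, so no match.
        System.out.println(count("yfs:league", false));
        // Descendant search from the root finds the element.
        System.out.println(count("//yfs:league", false));
        // An explicit absolute path also works.
        System.out.println(count("/yfs:fantasy_content/yfs:league", false));
        // Relative to the document element, the original expression matches.
        System.out.println(count("yfs:league", true));
    }
}
```

If you would rather keep the original relative expression, evaluating it against doc.getDocumentElement() instead of doc should also work, as the last call in main suggests. When you know the structure, the explicit path /yfs:fantasy_content/yfs:league is a bit stricter than //yfs:league, because it will not pick up league elements nested elsewhere in the document.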