
Convert XPath to jsoup query


Does anyone know of an XPath-to-jsoup converter? I get the following XPath from Chrome:

 //*[@id="docs"]/div[1]/h4/a

and would like to turn it into a jsoup query. The element at that path has an href attribute that I'm trying to read.


Solution

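  • For a simple path like this, the conversion is easy to do by hand: each "/" becomes the child combinator ">", and the id test becomes "#docs".

    Something like this (not tested):

    document.select("#docs > div:nth-of-type(1) > h4 > a").attr("href");

    Note that XPath positions are 1-based, while jsoup's :eq(n) is 0-based and counts all siblings, so div[1] maps to div:nth-of-type(1), not div:eq(1).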
    

    Documentation:

    http://jsoup.org/cookbook/extracting-data/selector-syntax


    Related question from comment

    Trying to get the href for the first result here: cbssports.com/info/search#q=fantasy%20tom%20brady

    Code

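    // Query the Solr endpoint directly and select every <str name="url">
    // element nested under response > result > doc.
    Elements select = Jsoup.connect("http://solr.cbssports.com/solr/select/?q=fantasy%20tom%20brady")
            .get()
            .select("response > result > doc > str[name=url]");

    // Print the URL held in each match; text() returns it without HTML escaping.
    // For only the first result, select.first() could be used instead of the loop.
    for (Element element : select) {
        System.out.println(element.text());
    }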
    

    Result

    http://fantasynews.cbssports.com/fantasyfootball/players/playerpage/187741/tom-brady
    http://www.cbssports.com/nfl/players/playerpage/187741/tom-brady
    http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1825265/brady-lisoski
    http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1766777/blake-brady
    http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1851211/brady-foltz
    http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1860955/brady-earnhardt
    http://fantasynews.cbssports.com/fantasycollegefootball/players/playerpage/1673397/brady-amack
    

    [Screenshot: Developer Console, grabbing URLs]
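
Coming back to the original question: for the simple paths that Chrome's "Copy XPath" produces, the manual conversion is mechanical enough to script. The following is only a minimal sketch, not a general converter. The class and method names (XPathToCss, toCssSelector) are hypothetical, it uses nothing but the JDK, and it assumes the path consists solely of child steps, an optional [@id="..."] test, and positional [n] predicates, with ids that need no CSS escaping. Anything else is rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: translates simple Chrome-style XPaths into CSS selectors.
public class XPathToCss {
    // Matches an id test such as *[@id="docs"] or div[@id='docs'].
    private static final Pattern ID_STEP =
            Pattern.compile("([\\w-]+|\\*)\\[@id=[\"']([^\"']+)[\"']\\]");
    // Matches a tag with an optional 1-based position, e.g. div[1] or h4.
    private static final Pattern TAG_STEP =
            Pattern.compile("([\\w-]+)(?:\\[(\\d+)\\])?");

    public static String toCssSelector(String xpath) {
        // Drop the leading "/" or "//"; the first CSS step needs no such prefix.
        String path = xpath.replaceFirst("^/{1,2}", "");
        List<String> parts = new ArrayList<>();
        for (String step : path.split("/")) {
            Matcher id = ID_STEP.matcher(step);
            Matcher tag = TAG_STEP.matcher(step);
            if (id.matches()) {
                // *[@id="docs"] becomes #docs; div[@id="docs"] becomes div#docs.
                String name = id.group(1).equals("*") ? "" : id.group(1);
                parts.add(name + "#" + id.group(2));
            } else if (tag.matches()) {
                // XPath div[n] is the n-th div child, i.e. div:nth-of-type(n).
                String pos = tag.group(2);
                parts.add(tag.group(1) + (pos == null ? "" : ":nth-of-type(" + pos + ")"));
            } else {
                // Attribute tests, axes, functions, "//" in mid-path, and so on.
                throw new IllegalArgumentException("Unsupported XPath step: " + step);
            }
        }
        // Each "/" in the XPath is a child step, so join with the child combinator.
        return String.join(" > ", parts);
    }

    public static void main(String[] args) {
        System.out.println(toCssSelector("//*[@id=\"docs\"]/div[1]/h4/a"));
        // prints: #docs > div:nth-of-type(1) > h4 > a
    }
}
```

The returned string can be passed to document.select(...) as in the answer above. Mapping [n] to :nth-of-type(n) rather than :eq(n) keeps the XPath meaning of "the n-th child with this tag name".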