
Select element value from HTML via XPath


I have an HTML element that looks like this:

<p>
<strong>Popular resonses: </strong>
bat, butterfly, moth
</p>

The HTML consists almost entirely of <p> elements.

I need to extract the text of the <p> element itself (bat, butterfly, moth).

Thanks.

P.S.

I tried parsing it with Matcher and Pattern, but it didn't work. I'm using JSoup as the parsing library.


Solution

  • You can get the desired text with:

    // doc is an org.jsoup.nodes.Document, e.g. the result of Jsoup.parse(html)
    Elements el = doc.select("p:has(strong)");
    for (Element e : el) {
        System.out.println(e.ownText());
    }
    

    This finds every p element in the HTML that also contains a strong element, and prints the text that belongs directly to the p, not to the strong:

    bat, butterfly, moth

    If you use e.text() instead, you get all the text in the p element, including the text of its children:

    Popular resonses: bat, butterfly, moth

    If there is only one such element, you can also use:

    Element e = doc.select("p:has(strong)").first();
    System.out.println(e.ownText());
    

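    This saves you the loop. Note that first() returns null when nothing matches, so check the result before calling ownText() if the element may be absent.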
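Since the title asks about XPath: here is a rough sketch of the same idea using only the JDK's built-in XPath support instead of JSoup. It assumes the markup is well-formed XML, which real-world HTML often is not, so treat it as an illustration rather than a replacement for the JSoup selector. The class name OwnTextXPath, the helper ownTextOfParagraphs, and the wrapping <div> in the sample are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class OwnTextXPath {

    // Hypothetical helper: collects the non-blank text nodes that are direct
    // children of any <p> containing a <strong> child.
    // Assumption: the input is well-formed XML; the JDK parser rejects tag soup.
    static List<String> ownTextOfParagraphs(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // p[strong] keeps only <p> elements with a <strong> child;
        // text() selects their direct text nodes, so the text inside
        // <strong> is not included.
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//p[strong]/text()", doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            String text = nodes.item(i).getNodeValue().trim();
            if (!text.isEmpty()) { // skip whitespace-only nodes between tags
                result.add(text);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<div>"
                + "<p>\n<strong>Popular resonses: </strong>\nbat, butterfly, moth\n</p>"
                + "<p>no strong element here</p>"
                + "</div>";
        for (String text : ownTextOfParagraphs(xml)) {
            System.out.println(text);
        }
    }
}
```

One difference from ownText(): this returns one entry per text node rather than one per p element, so a p with text both before and after the strong yields two entries.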