java android jsoup

JSoup, count elements after a specific tag (h3)


I need to count the p elements that come after the h3[id=hm_2] tag. Is there a way to do that?

My current attempt does not work; the result should be 3. Many thanks in advance.

Here is my code:

  for (Element tag : doc.select("div.archive-style-wrapper")) {
      Elements headCat1 = tag.getElementsByTag("h3");
      for (Element headline : headCat1) {
          Elements importantLinks = headline.getElementsByTag("p");
          Log.w("result", String.valueOf(importantLinks.size()));
      }
  }

The relevant HTML:

 <h3 class="a-header--3" id="hm_2">Some text</h3>
 <p class="a-paragraph"><img src="data:image/gif;base64,R0lGODlhAQ"</img></p>
 <p class="a-paragraph">This new event quest, brought to us by</p>
 <p class="a-paragraph">This new event quest, brought to us by</p>
 <div class="imagelink"... </div>

Solution

  • You can use the general sibling selector E ~ F, which matches every F element that has an earlier sibling E, for example h1 ~ p. The full selector syntax is described in the jsoup documentation: Selector.html

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.select.Elements;

    // Sample HTML from the question, plus one extra paragraph after the div
    String html = "<h3 class=\"a-header--3\" id=\"hm_2\">Some text</h3>\n"
                  + " <p class=\"a-paragraph\"><img src=\"data:image/gif;base64,R0lGODlhAQ\"</img></p>\n"
                  + " <p class=\"a-paragraph\">This new event quest, brought to us by</p>\n"
                  + " <p class=\"a-paragraph\">This new event quest, brought to us by</p>\n"
                  + " <div class=\"imagelink\"... </div>"
                  + " <p class=\"a-paragraph\">This shall not be included</p>\n";

    Document doc = Jsoup.parseBodyFragment(html);
    // Select every p that is a later sibling of the h3 with id hm_2
    Elements paragraphs = doc.select("h3[id=hm_2] ~ p");

    System.out.println(paragraphs.size());    // number of matched paragraphs
    paragraphs.forEach(System.out::println);  // print each matched element
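
One thing to be aware of: the ~ combinator matches every later sibling, so with well-formed HTML a p that comes after the closing div at the same level would be counted as well. If only the paragraphs directly following the heading are wanted, walking the siblings by hand may be a better fit. The following is a minimal sketch of that idea, not part of the original answer; the class name, the helper countFollowingParagraphs, and the simplified sample HTML are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class CountParagraphsAfterHeading {

    // Hypothetical helper: counts the <p> elements that directly follow the
    // first element matched by headingSelector, stopping at the first
    // sibling that is not a <p>.
    static int countFollowingParagraphs(Document doc, String headingSelector) {
        Element heading = doc.selectFirst(headingSelector);
        if (heading == null) {
            return 0; // heading not found
        }
        int count = 0;
        Element sibling = heading.nextElementSibling();
        while (sibling != null && sibling.tagName().equals("p")) {
            count++;
            sibling = sibling.nextElementSibling();
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed, well-formed variant of the HTML from the question
        String html = "<h3 id=\"hm_2\">Some text</h3>"
                + "<p>one</p><p>two</p><p>three</p>"
                + "<div class=\"imagelink\"></div>"
                + "<p>This shall not be included</p>";
        Document doc = Jsoup.parseBodyFragment(html);

        // Consecutive paragraphs only: prints 3
        System.out.println(countFollowingParagraphs(doc, "h3[id=hm_2]"));
        // All later sibling paragraphs: prints 4
        System.out.println(doc.select("h3[id=hm_2] ~ p").size());
    }
}
```

On Android, the println calls can be swapped for Log.w as in the question. Which of the two counts is the right one depends on whether paragraphs further down the page should be included.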