java jsoup html-parsing

Jsoup: get data from table


I would like to get 71–85 from a web page using Jsoup. After some trial and error, I was able to do it with the following code:

document.select("#row_13 > div.row-desc > div").text();

I don't think this is a good solution, though. It relies on an id that is just part of the table. A better approach would be perhaps to get what appears after Pages. However, I have no clue how to approach this. Any help would be very welcome!

<div style="" class="  row row-even" id="row_13">
   <div class="row-label" >
      <div class="white_label ">Pages</div>
   </div>
   <div class="row-desc">
      <div class="white_desc " style="width: 100%">
         71–85
      </div>
   </div>
</div>

EDIT: Here's the page the above content is taken from: http://cejsh.icm.edu.pl/cejsh/element/bwmeta1.element.desklight-dc6b14c1-8478-426a-8f2e-bb5636e0a5e9


Solution

  • A better approach would be perhaps to get what appears after Pages.

    First, select the element with class row-label that contains the text "Pages". Then select the element that immediately follows it (its next sibling, which is what + * matches) and fetch its text.

    You can do so like this:

    document.select(".row-label:contains(Pages) + *").first().text();
    

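    You can even omit .first() if you know for a fact that only one element with class row-label contains the text "Pages". Elements.text() returns the combined text of all matched elements, so with a single match the result is the same: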

    document.select(".row-label:contains(Pages) + *").text();
    

    Refer to the Selector syntax for info on notation used.
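
For completeness, here is a minimal, self-contained sketch of the same idea, run against the HTML snippet from the question instead of the live page. The class name PagesLookup and the helper valueForLabel are made up for illustration, and it assumes a jsoup version that provides selectFirst. Unlike .first().text(), it returns null instead of throwing when the label is missing.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PagesLookup {

    // Hypothetical helper: returns the text of the element that directly
    // follows the .row-label containing the given label, or null if no
    // such row exists. Note that :contains is a case-insensitive substring
    // match, and that the label is pasted into the selector as-is, so it
    // must not contain characters such as parentheses.
    public static String valueForLabel(Document document, String label) {
        Element value = document.selectFirst(".row-label:contains(" + label + ") + *");
        return value == null ? null : value.text();
    }

    public static void main(String[] args) {
        // The markup from the question, parsed from a string.
        String html = "<div class=\"row row-even\" id=\"row_13\">"
                + "<div class=\"row-label\"><div class=\"white_label\">Pages</div></div>"
                + "<div class=\"row-desc\"><div class=\"white_desc\" style=\"width: 100%\"> 71–85 </div></div>"
                + "</div>";
        Document document = Jsoup.parse(html);

        System.out.println(valueForLabel(document, "Pages"));  // prints 71–85
        System.out.println(valueForLabel(document, "Volume")); // prints null
    }
}
```

For the real page you would build the Document with Jsoup.connect(url).get() instead of Jsoup.parse(html); the selector stays the same as long as the page keeps the row-label / row-desc structure.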