Tags: html, xpath, siblings, scraper

XPath: Get following sibling


I have the following HTML structure. I am trying to build a robust method to extract the second "Color Digest" element, since there will be many of these tags within the DOM.

<table>
  <tbody>
    <tr bgcolor="#AAAAAA">
    <tr>
    <tr>
    <tr>
    <tr>
      <td>Color Digest </td>
      <td>AgArAQICGQMVBBwTIRQHIwg0GUMURAZTBWQJcwV0AoEDAQ </td>
    </tr>
    <tr>
      <td>Color Digest </td>
      <td>2,43,2,25,21,28,0,0,0,0,0,0,0,0,0,0,0,0,0,0,33,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0,0,0,0,0,0,20,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, </td>
    </tr>
  </tbody>
</table>

I am trying to extract the second "Color Digest" td element, the one that holds the decoded value.

I wrote the following XPath, but it does not return the second td element; it returns nothing.

//td[text() = ' Color Digest ']/following-sibling::td[2]

When I change td[2] to td[1], I get both elements.


Solution

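  • You should be looking for the second tr that contains a td equal to 'Color Digest'. Within that tr, take either the first following sibling of the label td, or simply the second td. Your expression fails because each label td has only one following td sibling, so following-sibling::td[2] matches nothing, while following-sibling::td[1] matches in both rows. If the cell text carries leading or trailing spaces, as in your ' Color Digest ', compare with td[normalize-space()='Color Digest'] instead.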

    Try the following:

    //tr[td='Color Digest'][2]/td/following-sibling::td[1]
    

    or

    //tr[td='Color Digest'][2]/td[2]
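
    The behaviour described above can be checked with a short script. This is only a sketch: it assumes Python with the third-party lxml package, and it uses a shortened, hypothetical copy of the table (the header row and the truncated values are made up for illustration). It matches the label with normalize-space() so that surrounding whitespace in the cell does not matter.

```python
from lxml import html

# Shortened stand-in for the table in the question (assumed layout and values).
PAGE = """
<table>
  <tbody>
    <tr bgcolor="#AAAAAA"><td>Header</td><td>Value</td></tr>
    <tr><td>Color Digest </td><td>AgArAQIC </td></tr>
    <tr><td>Color Digest </td><td>2,43,2,25 </td></tr>
  </tbody>
</table>
"""

tree = html.fromstring(PAGE)

# Whitespace-tolerant test for the label cell.
label = "td[normalize-space()='Color Digest']"

# The question's approach: the second td AFTER the label cell.
# Each label cell has only one following td, so nothing matches.
no_match = tree.xpath(f"//{label}/following-sibling::td[2]")

# With td[1] the value cell of every "Color Digest" row matches.
both = tree.xpath(f"//{label}/following-sibling::td[1]")

# The answer's approach: pick the second matching row first,
# then take its second cell.
second = tree.xpath(f"//tr[{label}][2]/td[2]")

print(len(no_match), len(both), len(second))
print(second[0].text.strip())
```

    Note that the [2] in //tr[...][2] counts matching rows within the same parent, so this form relies on both "Color Digest" rows sitting in the same tbody, as they do in the question.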