
Find specific element position in XPath after checking a condition


I am working with the following HTML (an excerpt):

<table class="detailTable">
  <tbody>
    <tr>
      <td class="detailTitle" align="top">
        <h3>Credit Limit:</h3>
        <h3>Current Balance:</h3>
        <h3>Pending Balance:</h3>
        <h3>Available Credit:</h3>
      </td>
      <td align="top">
        <p>$677.77</p>
        <p>$7.77</p>
        <p>$7.77</p>
        <p>$677.77</p>
      </td>
      <td class="detailTitle">
        <h3>Last Statement Date:</h3>
        <h4>Payment Address</h4>
      </td>
      <td>
        <p>   05/19/2015  </p>
        <p class="attribution">
      </td>
    </tr>
  </tbody>
</table>

I need to first check whether "Statement Date" exists and find its position, then get its value, which is in the corresponding <p> tag. I need to do this using XPath. Any suggestions?

So far I tried using //table[@class='detailTable'][1]//td[2]//p[position(td[contains(.,'Statement Date')])] but it doesn't work.


Solution

  • This is one possible way (formatted for readability):

    //table[@class='detailTable']
    //tr
    /td[*[contains(.,'Statement Date')]]
    /following-sibling::td[1]
    /*[position() 
          = 
        count(
            parent::td
            /preceding-sibling::td[1]
            /*[contains(.,'Statement Date')]/preceding-sibling::*
        )+1
      ]
    

    Explanation:

    • ..../td[*[contains(.,'Statement Date')]] : Up to this point, the XPath finds the td element that has at least one child element containing the text "Statement Date". If no such td exists, the whole expression returns an empty node-set, which covers the existence check.
    • /following-sibling::td[1] : From the matched td, navigate to the nearest following sibling td ...
    • /*[position() = count(parent::td/preceding-sibling::td[1]/*[contains(.,'Statement Date')]/preceding-sibling::*)+1] : ... and return the child element whose position equals the position of the element containing "Statement Date" in the previous td. Note that count(preceding-sibling::*)+1 gives the 1-based position index of the element containing "Statement Date".
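The index-matching idea in the explanation above can also be spelled out by hand. The following is a minimal sketch in Python using only the standard library's ElementTree; it does not run the XPath itself, because ElementTree's XPath subset lacks count() and the sibling axes. The function name value_for_label is hypothetical, and the markup is a trimmed, well-formed copy of the question's table (the unclosed <p> is closed here so that it parses as XML).

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the question's markup (assumption: the
# unclosed <p class="attribution"> is closed so the snippet parses as XML).
SNIPPET = """
<table class="detailTable">
  <tbody>
    <tr>
      <td class="detailTitle">
        <h3>Credit Limit:</h3>
        <h3>Current Balance:</h3>
      </td>
      <td>
        <p>$677.77</p>
        <p>$7.77</p>
      </td>
      <td class="detailTitle">
        <h3>Last Statement Date:</h3>
        <h4>Payment Address</h4>
      </td>
      <td>
        <p>   05/19/2015  </p>
        <p class="attribution"></p>
      </td>
    </tr>
  </tbody>
</table>
"""


def value_for_label(table, label):
    """Return the stripped text of the value paired with `label`, or None.

    Hypothetical helper mirroring the three XPath steps:
    1. find a td with a child element whose text contains `label`;
    2. move to the next sibling td;
    3. take the child at the same position as the label element.
    """
    for row in table.iter("tr"):
        cells = list(row)  # child elements of the row (the td cells)
        for i, cell in enumerate(cells[:-1]):
            for pos, title in enumerate(cell):
                # `pos` is the 0-based equivalent of
                # count(preceding-sibling::*) in the XPath.
                if label in "".join(title.itertext()):
                    values = list(cells[i + 1])
                    if pos < len(values):
                        return (values[pos].text or "").strip()
                    return None
    return None


table = ET.fromstring(SNIPPET)
print(value_for_label(table, "Statement Date"))
```

A return value of None plays the role of the empty node-set: it signals that the label was not found, or that the neighbouring td has no element at the matching position.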