I have a table and want to extract data from some data cells.
<table>
<tr>
<td class="label"> </td>
<td class="data"><p><a href="http://en.wikipedia.org/wiki/Liu_Kang"><img src="http://upload.wikimedia.org/wikipedia/en/e/e2/LiuKangshaolinmonks.jpg"/></a></p>
</td>
</tr>
<tr>
<td class="label">First game</td>
<td class="data">Mortal Kombat (1992)</td>
</tr>
<tr>
<td class="label">Created by</td>
<td class="data">John Tobias</td>
</tr>
<tr>
<td class="label">Orgin</td>
<td class="data">Earthrealm</td>
</tr>
<tr>
<td class="label">Weapon</td>
<td class="data">Nunchaku</td>
</tr>
<tr>
<td class="label">Colour</td>
<td class="data">Red</td>
</tr>
</table>
I would like to extract Nunchaku. This works:
/html/body//tr[5]/td[@class="data"]
But I would rather not hard-code tr[5], and instead select the row by its label with something like td[contains(., 'Weapon')]. I am unsure how to do that.
You need to use the following-sibling:: axis, which selects the td that comes after the matching label cell in the same row:
//td[contains(., 'Weapon')]/following-sibling::td
Check out this Stack Overflow question or read the XPath documentation for more information about following-sibling.
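To try the expression quickly, here is a minimal sketch in Python. It assumes the third-party lxml package is installed and that the table is well-formed enough to parse as XML; the trimmed TABLE string and the variable names are only for illustration. The second query is an optional stricter variant that matches the label exactly and restricts the result to the data cell.

```python
from lxml import etree

# Trimmed copy of the table from the question (illustrative; assumed well-formed).
TABLE = """
<table>
<tr><td class="label">Created by</td><td class="data">John Tobias</td></tr>
<tr><td class="label">Weapon</td><td class="data">Nunchaku</td></tr>
<tr><td class="label">Colour</td><td class="data">Red</td></tr>
</table>
"""

root = etree.fromstring(TABLE)

# Find the label cell containing "Weapon", then step to the td after it in the same row.
cells = root.xpath("//td[contains(., 'Weapon')]/following-sibling::td")
weapon = cells[0].text

# Stricter variant: exact label text, and only a sibling with class="data".
strict = root.xpath(
    "//td[@class='label'][normalize-space(.)='Weapon']"
    "/following-sibling::td[@class='data']/text()"
)

print(weapon)
print(strict)
```

The stricter form may be worth it if another cell could also contain the word Weapon, since contains() matches substrings anywhere in the cell text.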