Here's the outerHTML of an element on a webpage:
<td valign="top">
<script type="text/javascript">sjcap();</script><p><input type="text" id="uword" name="uword" class="" size="20"></p><p><img src="/wps/PA_1_ATAGT15208O2F02M34340U0000/./cimg/31.jpg" width="290" height="80" alt=""></p>
</td>
I am trying to build an XPath for the image and extract its src attribute using HTMLSession from requests_html.
Here's my XPath, but it doesn't match the element: //input[@id='uword']/following-sibling::p
I inspected the element and tried using Ctrl + F to search for the XPath, but I got 0 results.
First, the HTML in your question is not well-formed XML (the <input> and <img> elements aren't closed). Second, the <p> element containing the <img> child is not a sibling of the <input> tag, but of that tag's <p> parent, so following-sibling::p evaluated from the <input> matches nothing. Assuming the HTML is fixed like this:
<td valign="top">
<script type="text/javascript">sjcap();</script>
<p>
<input type="text" id="uword" name="uword" class="" size="20"/>
</p>
<p>
<img src="/wps/PA_1_ATAGT15208O2F02M34340U0000/./cimg/31.jpg" width="290" height="80" alt=""/>
</p>
</td>
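As a rough check of the well-formedness point, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which is a strict XML parser. The helper name `is_well_formed` and the two shortened fragments are made up for illustration; only the `<input>` markup is taken from the question.

```python
import xml.etree.ElementTree as ET

# Fragment as it appears in the question: <input> is never closed.
raw = '<p><input type="text" id="uword" name="uword" class="" size="20"></p>'
# Same fragment with the element self-closed, as in the fixed markup.
fixed = '<p><input type="text" id="uword" name="uword" class="" size="20"/></p>'


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


raw_ok = is_well_formed(raw)      # the unclosed <input> is rejected
fixed_ok = is_well_formed(fixed)  # the self-closed version parses
print(raw_ok, fixed_ok)
```

This only matters for strict XML tooling; a lenient HTML parser will generally accept the unclosed `<input>` and `<img>` as they are.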
Either of the following XPath expressions
//p[./input[@id="uword"]]/following-sibling::p/img/@src
or
//p/input[@id="uword"]/../following-sibling::p/img/@src
should output
/wps/PA_1_ATAGT15208O2F02M34340U0000/./cimg/31.jpg
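To try both the original and the corrected expressions, here is a small sketch using `lxml.html` directly (as far as I know, requests_html evaluates XPath through lxml as well). It assumes lxml is installed. The markup is the unmodified snippet from the question, wrapped in `<table><tr>` only so the fragment is valid table markup; the variable names are illustrative.

```python
from lxml import html

# Markup from the question, unclosed <input> and <img> included.
# The <table><tr> wrapper is an assumption added for this sketch.
RAW = (
    '<table><tr><td valign="top">'
    '<script type="text/javascript">sjcap();</script>'
    '<p><input type="text" id="uword" name="uword" class="" size="20"></p>'
    '<p><img src="/wps/PA_1_ATAGT15208O2F02M34340U0000/./cimg/31.jpg" '
    'width="290" height="80" alt=""></p>'
    '</td></tr></table>'
)

tree = html.fromstring(RAW)

# Original attempt: the <input> has no sibling <p>, so nothing matches.
wrong = tree.xpath("//input[@id='uword']/following-sibling::p")

# Corrected: start from the <p> that wraps the <input>, step to its
# sibling <p>, then read the src attribute of the <img> inside it.
srcs = tree.xpath('//p[./input[@id="uword"]]/following-sibling::p/img/@src')

print(wrong)  # []
print(srcs)   # ['/wps/PA_1_ATAGT15208O2F02M34340U0000/./cimg/31.jpg']
```

With requests_html, the same expression can presumably be passed to `r.html.xpath(...)` on the response from `HTMLSession().get(...)`. If the image is inserted by the `sjcap()` script rather than present in the served HTML, the page may need to be rendered first.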