I am struggling with the syntax required to grab some hrefs in a <td>.
The <table>, <tr> and <td> elements don't have any classes or ids.
If I wanted to grab the anchor in this example, what would I need?
<tr>
<td>
<a>...</a>
</td>
</tr>
As per the docs, you first make a parse tree:
import BeautifulSoup
html = "<html><body><tr><td><a href='foo'/></td></tr></body></html>"
soup = BeautifulSoup.BeautifulSoup(html)
and then you search in it, for example for <a> tags whose immediate parent is a <td>:
for ana in soup.findAll('a'):
    # keep only anchors that sit directly inside a <td>
    if ana.parent.name == 'td':
        print ana["href"]
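
The snippet above uses the old BeautifulSoup 3 API under Python 2. On Python 3 the same idea could look roughly like the sketch below. It assumes the `bs4` package (BeautifulSoup 4) is installed, and the sample HTML with its `foo` and `bar` links is made up for illustration. A CSS child selector does the "immediate parent" check in one step, and the explicit loop is shown next to it for comparison.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one anchor directly inside a <td>,
# and one nested a level deeper inside a <span>.
html = """
<table>
  <tr>
    <td><a href="foo">direct child</a></td>
    <td><span><a href="bar">nested deeper</a></span></td>
  </tr>
</table>
"""

# Use the parser that ships with Python so no extra dependency is needed.
soup = BeautifulSoup(html, "html.parser")

# "td > a[href]" matches <a> tags that have an href attribute
# and whose immediate parent is a <td>.
hrefs = [a["href"] for a in soup.select("td > a[href]")]

# The same check written as an explicit loop over all anchors with an href.
hrefs_loop = [
    a["href"]
    for a in soup.find_all("a", href=True)
    if a.parent.name == "td"
]

print(hrefs)       # ['foo']
print(hrefs_loop)  # ['foo']
```

If you want every anchor anywhere inside a cell, not just direct children, the descendant selector `td a[href]` (a space instead of `>`) should pick up the nested one as well.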