I want to extract the text from tags that come in two forms:
<td><div><font> Something else</font></div></td>
and
<td><div><font> Something <br/>else</font></div></td>
I am using .string, which in the first case gives me the required string (Something else), but in the second case gives me None.
Is there any better way or alternative way to do it?
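Try using the .text property instead of .string. When a tag has more than one child (here the text, the <br/>, and more text), .string returns None, while .text concatenates all the text inside the tag: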
from bs4 import BeautifulSoup

html1 = '<td><div><font> Something else</font></div></td>'
html2 = '<td><div><font> Something <br/>else</font></div></td>'

if __name__ == '__main__':
    # First form: the <font> tag holds a single text node
    soup1 = BeautifulSoup(html1, 'html.parser')
    div1 = soup1.select_one('div')
    print(div1.text.strip())

    # Second form: the text is split in two by <br/>
    soup2 = BeautifulSoup(html2, 'html.parser')
    div2 = soup2.select_one('div')
    print(div2.text.strip())
which outputs:
Something else
Something else
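
If you need more control over whitespace, get_text() and stripped_strings may also help. The following is only a sketch against the second snippet from the question; the variable names (font, string_value, joined, pieces) are made up for illustration. It shows why .string is None there and two other ways to collect the text:

```python
from bs4 import BeautifulSoup

html2 = '<td><div><font> Something <br/>else</font></div></td>'
soup = BeautifulSoup(html2, 'html.parser')
font = soup.find('font')

# .string is None because <font> has three children: text, <br/>, text
string_value = font.string

# get_text() joins every descendant string; strip=True trims each piece
# before joining, so a separator is needed to keep the words apart
joined = font.get_text(separator=' ', strip=True)

# stripped_strings yields each non-empty piece with whitespace trimmed
pieces = list(font.stripped_strings)

print(string_value)
print(joined)
print(pieces)
```

Using a separator is a judgment call: without it, strip=True would glue the two pieces together, which is usually not what you want when a <br/> stands in for a space or line break.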