I want to parse only one span tag in my html document. There are three sibling span tags without any class or I'd. I am targeting the second one only using BeautifulSoup 4.
Given the following html document:
<div class="adress">
<span>35456 street</span>
<span>city, state</span>
<span>zipcode</span>
</div>
I tried:
for spn in soup.findAll('span'):
data = spn[1].text
but it didn't work. The expected result is the text in the second span stored in a a variable:
data = "city, state"
and how to to get both the first and second span concatenated in one variable.
You are trying to slice an individual span
(a Tag
instance). Get rid of the for
loop and slice the findAll
response instead, i.e.
>>> soup.findAll('span')[1]
<span>city, state</span>
You can get the first and second tags together using:
>>> soup.findAll('span')[:2]
[<span>35456 street</span>, <span>city, state</span>]
or, as a string:
>>> "".join([str(tag) for tag in soup.findAll('span')[:2]])
'<span>35456 street</span><span>city, state</span>'