I'm trying to get the second span value in this div and in others like it (shown below):
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
I've looked at similar Stack Overflow posts, but I still couldn't figure out how to fix this. Here's my current code:
time = soup.find_all('div', {'class': 'C(#959595) Fz(11px) D(ib) Mb(6px)'})
for i in time:
    print(i.text)  # this prints VALUE 1 x amount of times (there are multiple divs)
I've tried things like i.span, i.contents, i.children, etc. I'd really appreciate any help, thanks!
There are several ways to get the value you want. The examples below use the SimplifiedDoc class from the simplified_scrapy library:
from simplified_scrapy.simplified_doc import SimplifiedDoc
html='''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
'''
doc = SimplifiedDoc(html)
divs = doc.getElementsByClass('C(#959595) Fz(11px) D(ib) Mb(6px)')
for div in divs:
    value = div.getElementByTag('span', start='</span>')  # Use start to skip the first
    print(value)
    value = div.getElementByTag('span', before='<span>', end=len(div.html))  # Locate the last
    print(value)
    value = div.i.next  # Use <i> to locate
    print(value)
    value = div.spans[-1]
    print(value)
    print(value.text)
Result:
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
TRYING TO GET THIS
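For comparison, here is a rough sketch of the same idea in BeautifulSoup, which is what the question uses. It assumes the bs4 package is installed and that the real page has the same structure as the snippet in the question; the names results and via_sibling are just illustrative. It collects the spans inside each matching div and takes the second one, and also shows locating it from the <i> tag.

```python
from bs4 import BeautifulSoup

html = '''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

# A class string containing spaces is matched against the full class attribute value
divs = soup.find_all('div', {'class': 'C(#959595) Fz(11px) D(ib) Mb(6px)'})

results = []      # second span, picked by position
via_sibling = []  # second span, picked as the span following the <i> tag
for div in divs:
    spans = div.find_all('span')
    if len(spans) >= 2:  # skip divs that do not have a second span
        results.append(spans[1].get_text(strip=True))

    separator = div.find('i')
    if separator is not None:
        nxt = separator.find_next_sibling('span')
        if nxt is not None:
            via_sibling.append(nxt.get_text(strip=True))

print(results)
print(via_sibling)
```

If this still only gives VALUE 1 on the real page, the HTML the script receives probably differs from what the browser shows, so it may be worth printing one of the divs to check.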