I am trying to scrape data from an e-commerce site. I need the discount percentage of each product, which sits in a span tag inside a div tag with the class name "VGWI6T". However, my code also returns the discount percentage of products whose div has the class name "VGWI6T _2YXy_Y".
<div>
.......
.......
......
<div class= "VGWI6T">
<span>25% off</span>
</div>
.....
.....
.....
</div>
.........
...........
......
<div>
....
....
....
<div class= "VGWI6T _2YXy_Y">
<span>25% off</span>
</div>
....
.....
</div>
How can I grab ONLY the products with the former class name (VGWI6T)? When I run:
Discount = bs.find_all('div',class_='VGWI6T', attars= 'span')
it returns the discounts of all products, including those whose div has the class VGWI6T _2YXy_Y.
Use a CSS selector with :not() to exclude any element that also has the _2YXy_Y class:
from bs4 import BeautifulSoup
html='''<div>
.......
.......
......
<div class= "VGWI6T">
<span>25% off</span>
</div>
.....
.....
.....
</div>
.........
...........
......
<div>
....
....
....
<div class= "VGWI6T _2YXy_Y">
<span>25% off</span>
</div>
....
.....
</div>'''
soup = BeautifulSoup(html, "html.parser")
# Select spans inside elements that have class VGWI6T but not _2YXy_Y.
for item in soup.select(".VGWI6T:not(._2YXy_Y) span"):
    print(item.text)
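The reason the original find_all call matches both divs is that class_='VGWI6T' matches any tag that has VGWI6T among its classes, not only tags whose class attribute is exactly that value. If you would rather not name the class to exclude, another option is to compare the whole class list. The following is a minimal sketch, assuming a trimmed-down copy of the markup; the helper name has_only_target_class and the "40% off" value are made up for illustration so the two products can be told apart.

```python
from bs4 import BeautifulSoup

# Trimmed-down version of the markup from the question.
# The second discount value is illustrative, not from the real page.
html = '''
<div><div class="VGWI6T"><span>25% off</span></div></div>
<div><div class="VGWI6T _2YXy_Y"><span>40% off</span></div></div>
'''

soup = BeautifulSoup(html, "html.parser")


def has_only_target_class(tag):
    # Beautiful Soup exposes the class attribute as a list of names,
    # so an exact list comparison rejects divs carrying extra classes.
    return tag.name == "div" and tag.get("class") == ["VGWI6T"]


discounts = [
    span.get_text(strip=True)
    for div in soup.find_all(has_only_target_class)
    for span in div.find_all("span")
]
print(discounts)
```

This keeps only divs whose class list is exactly VGWI6T, so it also skips any other variant (for example a div with VGWI6T plus a third class) that the :not() selector would still let through.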