python-3.x, beautifulsoup, python-requests, python-beautifultable

What should I do if a class name is present in another class name?


I am trying to scrape some data from an e-commerce site. I need the discount percentage of each product, which is in a span tag inside a div tag with the class name "VGWI6T". However, my code also returns the discount percentages of products whose div has the class name "VGWI6T _2YXy_Y".

<div>
.......
.......
......
<div class= "VGWI6T">

  <span>25% off</span>

</div>
.....
.....
.....
</div>

.........
...........
......

<div>
....
....
....
<div class= "VGWI6T _2YXy_Y">

  <span>25% off</span>
</div>
....
.....
</div>

How can I grab ONLY the products with the former class name (VGWI6T)? When I do:

Discount = bs.find_all('div',class_='VGWI6T', attars= 'span')

it gives me the discounts of all products, including those whose div has the class VGWI6T _2YXy_Y.


Solution

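  • class_='VGWI6T' matches every tag that has VGWI6T among its classes, so use a CSS selector with :not() to exclude tags that also have the _2YXy_Y class: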

    from bs4 import BeautifulSoup
    html='''<div>
    .......
    .......
    ......
    <div class= "VGWI6T">
    
      <span>25% off</span>
    
    </div>
    .....
    .....
    .....
    </div>
    
    .........
    ...........
    ......
    
    <div>
    ....
    ....
    ....
    <div class= "VGWI6T _2YXy_Y">
    
      <span>25% off</span>
    </div>
    ....
    .....
    </div>'''
    
    soup = BeautifulSoup(html, "html.parser")
    # Select spans under .VGWI6T elements that do not also have the _2YXy_Y class
    for item in soup.select(".VGWI6T:not(._2YXy_Y) span"):
        print(item.text)
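
If you would rather stay with find_all, one possible alternative is to pass a filter function that compares the tag's full class list. This is only a sketch: the helper name has_only_discount_class is made up for illustration, and the "40% off" value is invented sample data so the two divs can be told apart in the output.

```python
from bs4 import BeautifulSoup

# Shortened sample markup; "40% off" is made-up data for illustration.
html = """
<div><div class="VGWI6T"><span>25% off</span></div></div>
<div><div class="VGWI6T _2YXy_Y"><span>40% off</span></div></div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ matches any tag that has the class among its classes, so both divs match.
print(len(soup.find_all("div", class_="VGWI6T")))


def has_only_discount_class(tag):
    # BeautifulSoup stores class as a list, so compare against the whole list.
    return tag.name == "div" and tag.get("class") == ["VGWI6T"]


discounts = [
    span.get_text(strip=True)
    for div in soup.find_all(has_only_discount_class)
    for span in div.find_all("span")
]
print(discounts)
```

The :not() selector excludes one known extra class, while the function filter rejects any div that carries additional classes, so which one fits depends on how the site's markup varies.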