I have this HTML:
<html lang="en" class="no-js">
<div>
<p class="price ">
3.75
</p>
<p>21</p>
</div>
</html>
I want to get the class attribute of the first <p> element, including its trailing space.
The problem is that whatever I try, the value always comes back without the space:
current_element.get('class')...
Even str(current_element) comes out like this:
'<p class="price">3.75</p>'
How can I get the raw text of the class attribute, or something equivalent? Running a regex over the whole HTML is not an option, because the documents can have 11k lines or more.
Thanks!
If you pass the keyword argument multi_valued_attributes=None to the BeautifulSoup constructor, you get the class value as a single string with the space preserved.
(Source: https://beautiful-soup-4.readthedocs.io/en/latest/#multi-valued-attributes )
However, you lose the ability to access multi-valued attributes (such as class) as lists.
from bs4 import BeautifulSoup
html = """<html lang="en" class="no-js">
<div>
<p class="price ">
3.75
</p>
<p>21</p>
</div>
</html>"""
# Name the parser explicitly so bs4 does not warn about guessing one
soup = BeautifulSoup(html, "html.parser", multi_valued_attributes=None)
soup.html.div.p["class"]
Result:
'price '
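
If you still need the individual class names after turning list handling off, one option is to split the raw string yourself. The following is only a minimal sketch, assuming the built-in "html.parser" backend and a shortened version of the HTML from the question; the variable names are illustrative and not part of the Beautiful Soup API.

```python
from bs4 import BeautifulSoup

# Shortened version of the HTML from the question (assumption: same structure)
html = '<div><p class="price ">3.75</p><p>21</p></div>'

# Default behaviour: class is split on whitespace into a list
default_soup = BeautifulSoup(html, "html.parser")
default_class = default_soup.p["class"]

# With multi_valued_attributes=None the attribute stays a plain string
raw_soup = BeautifulSoup(html, "html.parser", multi_valued_attributes=None)
raw_class = raw_soup.p["class"]

# Recover the list of class names by splitting the raw string on whitespace
tokens = raw_class.split()

print(repr(default_class))  # list form, trailing space is gone
print(repr(raw_class))      # raw string, trailing space kept
print(tokens)
```

This keeps the raw value available for comparison while still giving you a list when you need one, at the cost of doing the split manually for each attribute.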