I want to obtain the body of all elements that have a specific class.
Python's xml.dom.minidom has a method for getting an element by id, getElementById(),
but I need to get all elements that have a specific class.
How can I do this?
Note: if this is not possible with minidom, please suggest a simple alternative that lets me get the full content of the elements with this class. By full content I mean all the subnodes and the text below them as well, as a single string.
I recommend using lxml instead of xml.dom.minidom.
Using lxml.html / cssselect (this needs the separate cssselect package installed):
import lxml.html
root = lxml.html.fromstring(document_string)
# 'elem.class' is a placeholder selector: tag name, a dot, then the class name
for elem in root.cssselect('elem.class'):
    print(elem.tag)
    print(elem.get('src'))
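To get the full content as a string, as the question asks, something along these lines should work. This is a minimal sketch: the sample document, the class name "note" and the helper name elements_as_strings are made up for illustration. It uses find_class() from lxml.html, which does not need cssselect, then tostring() for the markup and text_content() for the text only.

```python
import lxml.html

# Hypothetical sample document; the class name "note" is an assumption.
document_string = """
<div>
  <p class="note">Hello <b>world</b></p>
  <p class="other">skip me</p>
  <span class="note big">second</span>
</div>
"""

root = lxml.html.fromstring(document_string)


def elements_as_strings(root, class_name):
    """Return (markup, text) pairs for every element carrying class_name."""
    results = []
    for elem in root.find_class(class_name):
        # with_tail=False leaves out text that follows the closing tag.
        markup = lxml.html.tostring(elem, encoding="unicode", with_tail=False)
        # text_content() joins the text of the element and all its children.
        text = elem.text_content()
        results.append((markup, text))
    return results


pairs = elements_as_strings(root, "note")
for markup, text in pairs:
    print(markup)
    print(text)
```

find_class() matches whole class names, so an element with class="note big" is found but one with class="notebook" is not.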
Using lxml.etree / xpath:
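import lxml.etree
root = lxml.etree.fromstring(document_string)
# A plain contains(@class, "class") would also match e.g. "subclass",
# so pad with spaces and match the class name as a whole word.
for elem in root.xpath('.//elem[contains(concat(" ", normalize-space(@class), " "), " class ")]'):
    print(elem.tag)
    print(elem.get('src'))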
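If installing lxml is not an option, the same thing can probably be done with minidom alone. The sketch below is only an illustration: get_elements_by_class and text_of are hypothetical helpers (minidom has no built-in lookup by class), and the sample document and the class name "note" are assumptions. It relies on getElementsByTagName("*") to walk every element and on toxml() to serialize an element together with its subnodes.

```python
from xml.dom import minidom

# Hypothetical sample document; the class name "note" is an assumption.
document_string = (
    '<root>'
    '<item class="note">Hello <b>world</b></item>'
    '<item class="other">skip me</item>'
    '<item class="note big">second</item>'
    '</root>'
)


def get_elements_by_class(node, class_name):
    """Return all elements below node whose class attribute lists class_name."""
    matches = []
    # "*" is the wildcard tag name: it matches every element.
    for elem in node.getElementsByTagName("*"):
        # getAttribute returns "" when the attribute is missing.
        if class_name in elem.getAttribute("class").split():
            matches.append(elem)
    return matches


def text_of(node):
    """Concatenate all text below node, in document order."""
    parts = []
    for child in node.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts)


dom = minidom.parseString(document_string)
found = get_elements_by_class(dom, "note")
markup = [elem.toxml() for elem in found]   # element plus all subnodes
texts = [text_of(elem) for elem in found]   # text only
print(markup)
print(texts)
```

Note that minidom only parses well-formed XML, so for real-world HTML the lxml.html route is likely the safer choice.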