I've been parsing an ol element in HTML and ran into a problem with the indexing of its child elements.
Let's assume we have the following element:
html_document = """
<ol>
<li>Test lists</li>
<li>Second option</li>
<li>Third option</li>
</ol>
"""
So, let's parse it:
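from bs4 import BeautifulSoup

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html_document, 'html.parser')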
all_li = tuple(soup.find_all('li'))
result = [el.parent.index(el) for el in all_li]
print(result) # [1, 3, 5]
Why 1, 3, 5? Have I missed something?
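You are indexing into the parent tag. el.parent.index(el) returns the position of el among all of the parent's children, and those children include the newline text nodes between the li tags, so the li elements sit at positions 1, 3 and 5. Index into your own list of li tags instead: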
html_document = """
<ol>
<li>Test lists</li>
<li>Second option</li>
<li>Third option</li>
</ol>
"""
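from bs4 import BeautifulSoup

soup = BeautifulSoup(html_document, 'lxml')
all_li = tuple(soup.find_all('li'))
# Position of each li within the tuple of li tags, not within the parent's children
result = [all_li.index(el) for el in all_li]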
print(result)
Output:
[0, 1, 2]
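To see where 1, 3, 5 comes from, it may help to look at the children of the ol directly. The sketch below is only an illustration: it assumes the built-in 'html.parser' rather than lxml, and li_position is a hypothetical helper name, not part of BeautifulSoup. The helper compares tags by identity, because tuple.index compares by equality and two li tags with identical markup compare equal, which would give both of them the index of the first.

```python
from bs4 import BeautifulSoup, Tag

html_document = """
<ol>
<li>Test lists</li>
<li>Second option</li>
<li>Third option</li>
</ol>
"""

soup = BeautifulSoup(html_document, "html.parser")
ol = soup.find("ol")

# .contents holds every child node, including the newline strings between tags.
child_kinds = [c.name if isinstance(c, Tag) else "text" for c in ol.contents]
print(child_kinds)  # ['text', 'li', 'text', 'li', 'text', 'li', 'text']


def li_position(li):
    """Hypothetical helper: position of li among its parent's direct li children."""
    siblings = li.parent.find_all("li", recursive=False)
    # Compare by identity so that li tags with identical markup stay distinct.
    return next(i for i, sibling in enumerate(siblings) if sibling is li)


positions = [li_position(li) for li in ol.find_all("li")]
print(positions)  # [0, 1, 2]
```

If you only need the positions in document order, enumerate(soup.find_all('li')) gives the same numbers without any lookup.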