python html beautifulsoup html-parsing

Find previous occurrence of an element


I have the following html:

<h4>Testing</h4>
<h3>Test</h3>
<h3>Test2</h3>
<h4>Testing2</h4>

If I have the element <h3>Test2</h3> referenced in a variable, how can I find <h4>Testing</h4>, that is, the h4 before the referenced element rather than the one after it?


Solution

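  • Use .previous_sibling to get the node immediately before the element. This may be a whitespace text node or a different tag rather than the h4 you want: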

    element.previous_sibling
    

    Or use .find_previous_sibling() to find the closest preceding sibling that is an h4 tag:

    element.find_previous_sibling('h4')
    

    Demo:

    >>> from bs4 import BeautifulSoup
    >>> data = """
    ... <h4>Testing</h4>
    ... <h3>Test</h3>
    ... <h3>Test2</h3>
    ... <h4>Testing2</h4>
    ... """
    >>> soup = BeautifulSoup(data, 'html.parser')
    >>> element = soup.find('h3', string='Test2')
    >>> element.find_previous_sibling('h4')
    <h4>Testing</h4>
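
To make the difference between the two approaches concrete, here is a minimal sketch using the HTML from the question. It assumes bs4 4.4.0 or later (for the `string=` argument) and the built-in `html.parser`; the variable names are only illustrative. The `find_previous()` call at the end is an addition not covered above: it searches everything earlier in the document, so it may help if the h4 is not a sibling of the element.

```python
from bs4 import BeautifulSoup

html = """
<h4>Testing</h4>
<h3>Test</h3>
<h3>Test2</h3>
<h4>Testing2</h4>
"""

# html.parser ships with Python, so no extra parser needs to be installed.
soup = BeautifulSoup(html, "html.parser")
element = soup.find("h3", string="Test2")

# .previous_sibling returns whatever node sits directly before the element.
# With this markup that is the newline text between the two <h3> tags.
raw_previous = element.previous_sibling

# .find_previous_sibling('h4') walks backwards over the siblings and skips
# text nodes and tags that are not <h4>.
previous_h4 = element.find_previous_sibling("h4")

# .find_previous('h4') looks at everything parsed earlier in the document,
# not only siblings (assumption: useful when the tags are nested differently).
previous_any_h4 = element.find_previous("h4")

print(repr(raw_previous))
print(previous_h4)
print(previous_any_h4)
```

If the markup has no whitespace between tags, `.previous_sibling` returns the preceding tag directly, which is why the tag-name filter is usually the safer choice.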