
Obtain HTML between two tags using Selenium


Set Up

I'm using Selenium to obtain a set of links on a page.

The page's HTML structure is 'flat' (no indentation, no nested children, etc.) and looks like this:

 <h2>TAG1</h2>
 <a href...>...</a>
 'more links'
 <a href...>...</a>
 <h2>TAG2</h2>

Problem

The links I want to obtain are located between (not inside) the two h2 tags.

How do I tell Selenium to obtain the HTML (or, directly, the links) between TAG1 and TAG2?


Solution

  • This XPath should do the trick:

    //a[./preceding-sibling::h2[.='TAG1']][./following-sibling::h2[.='TAG2']]
    

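    The XPath reads: select all a elements that have a preceding sibling h2 with text TAG1 and a following sibling h2 with text TAG2. Because the page is flat, the links and the h2 headings are siblings, which is why the sibling axes work here.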
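
A minimal sketch of how this XPath might be used from Python, assuming Selenium 4 and its `find_elements(By.XPATH, ...)` call. The helper names `between_headings_xpath` and `links_between` are hypothetical, as are the driver setup and the URL shown in the trailing comments; none of them come from the original question.

```python
def between_headings_xpath(start_text, end_text):
    """Build the XPath that matches <a> siblings between two <h2> headings.

    Assumption: neither heading text contains a single quote, since the
    values are placed inside single-quoted XPath string literals.
    """
    return (
        f"//a[./preceding-sibling::h2[.='{start_text}']]"
        f"[./following-sibling::h2[.='{end_text}']]"
    )


def links_between(driver, start_text, end_text):
    """Return the href of every <a> located between the two headings."""
    # Imported here so the XPath helper above works without Selenium installed.
    from selenium.webdriver.common.by import By

    elements = driver.find_elements(
        By.XPATH, between_headings_xpath(start_text, end_text)
    )
    return [element.get_attribute("href") for element in elements]


# Hypothetical usage (driver choice and URL are placeholders):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get("https://example.com/page")
#   print(links_between(driver, "TAG1", "TAG2"))
#   driver.quit()
```

Keeping the XPath construction in its own function makes it easy to reuse for other pairs of headings on the same page.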