
Finding an element by partial href (Python Selenium)


I'm trying to access text from elements that have different xpaths but very predictable href schemes across multiple pages in a web database. Here are some examples:

<a href="/mathscinet/search/mscdoc.html?code=65J22,(35R30,47A52,65J20,65R30,90C30)">
65J22 (35R30 47A52 65J20 65R30 90C30) </a>

In this example I would want to extract "65J22 (35R30 47A52 65J20 65R30 90C30)"

<a href="/mathscinet/search/mscdoc.html?code=05C80,(05C15)">
05C80 (05C15) </a>

In this example I would want to extract "05C80 (05C15)". My web scraper would not be able to search by xpath directly due to the xpaths of my desired elements changing between pages, so I am looking for a more roundabout approach.

My main idea is to use the fact that every href contains "/mathscinet/search/mscdoc.html?code=". Selenium can't directly search for hrefs, but I was thinking of doing something similar to this C# implementation:

Driver.Instance.FindElement(By.XPath("//a[contains(@href, 'long')]"))

To port this over to Python, the only analogous approach I could think of is the in operator, but I am not sure how the syntax would work when everything is nested inside a find_element_by_xpath call. How would I bring all of these ideas together to obtain my desired text?

driver.find_element_by_xpath("//a['/mathscinet/search/mscdoc.html?code=' in @href]").text

Solution

  • If I understand correctly, you want to locate all elements that share the same partial href. XPath's contains() function does this, and it works from Python exactly as in the C# example:

    elements = driver.find_elements_by_xpath("//a[contains(@href, '/mathscinet/search/mscdoc.html')]")
    for element in elements:
        print(element.text)
    

    or, if you want to locate only the first matching element:

    driver.find_element_by_xpath("//a[contains(@href, '/mathscinet/search/mscdoc.html')]").text
    

    Note that find_elements_by_xpath returns a list of all matching elements, while find_element_by_xpath returns only the first match.
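
The snippets above use the Selenium 3 helper methods. Selenium 4 deprecated the find_element_by_* helpers and later releases removed them in favour of driver.find_elements(by, value). Below is a minimal sketch of the same lookup in that style. The helper names partial_href_xpath and link_texts are hypothetical, and the locator is passed as the plain string "xpath" (the value of By.XPATH) so that the sketch has no imports. It also assumes the href fragment contains no single quote.

```python
# Locator strategy string; this is the same value as
# selenium.webdriver.common.by.By.XPATH.
XPATH = "xpath"


def partial_href_xpath(fragment):
    """Build an XPath matching <a> elements whose href contains `fragment`."""
    if "'" in fragment:
        # Assumption: kept simple, the fragment is embedded in a
        # single-quoted XPath string literal, so it cannot contain a quote.
        raise ValueError("fragment must not contain a single quote")
    return f"//a[contains(@href, '{fragment}')]"


def link_texts(driver, fragment):
    """Return the stripped visible text of every link matching `fragment`."""
    elements = driver.find_elements(XPATH, partial_href_xpath(fragment))
    return [element.text.strip() for element in elements]
```

With a live driver this would be called as link_texts(driver, "/mathscinet/search/mscdoc.html?code="). In real code you would normally import By from selenium.webdriver.common.by and pass By.XPATH rather than the bare string. An empty list means nothing matched.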