Tags: python, selenium, for-loop, selenium-webdriver, staleelementreferenceexception

Python Selenium: iterate over a table of links, clicking each link


So this question has been asked before, but I am still struggling to get it working.

The webpage has a table of links, and I want to iterate through it, clicking each link.


So this is my code so far:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(executable_path=r'C:\Users\my_path\chromedriver_96.exe')
driver.get(r"https://www.fidelity.co.uk/shares/ftse-350/")

try:
    element = WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.CLASS_NAME, "table-scroll")))

    table = element.find_elements_by_xpath("//table//tbody/tr")
 
    for row in table[1:]:
        print(row.get_attribute('innerHTML'))
        # link.click()

finally:
    driver.close()

Sample of the output:

            <td>FOUR</td>
            <td><a href="/factsheets/4IMPRINT-GROUP/GB0006640972-GBP/?id=GB0006640972GBP&amp;idType=isin&amp;marketCode=&amp;idCurrencyid=" target="_parent">4imprint Group plc</a></td>
            <td>Media &amp; Publishing</td>
        

            <td>888</td>
            <td><a href="/factsheets/888-HOLDINGS/GI000A0F6407-GBP/?id=GI000A0F6407GBP&amp;idType=isin&amp;marketCode=&amp;idCurrencyid=" target="_parent">888 Holdings</a></td>
            <td>Hotels &amp; Entertainment Services</td>
        

            <td>ASL</td>
            <td><a href="/factsheets/ABERFORTH-SMALLER-COMPANIES-TRUST/GB0000066554-GBP/?id=GB0000066554GBP&amp;idType=isin&amp;marketCode=&amp;idCurrencyid=" target="_parent">Aberforth Smaller Companies Trust</a></td>
            <td>Collective Investments</td>


How do I click each href and then iterate to the next one?

Many thanks.

Edit: I went with this solution (a few small tweaks on Prophet's solution):

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
from selenium.webdriver.common.action_chains import ActionChains
from selenium.common.exceptions import StaleElementReferenceException


driver = webdriver.Chrome(executable_path=r'C:\Users\my_path\chromedriver_96.exe')
driver.get(r"https://www.fidelity.co.uk/shares/ftse-350/")
actions = ActionChains(driver)
#close the cookies banner
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID, "ensCloseBanner"))).click()
#wait for the first link in the table
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
#extra wait to make all the links loaded
time.sleep(1)
#get the full list of links
links = driver.find_elements_by_xpath('//table//tbody/tr/td/a')

for index, val in enumerate(links):
    try:
        #get the links again after getting back to the initial page in the loop
        links = driver.find_elements_by_xpath('//table//tbody/tr/td/a')
        #scroll to the n-th link, it may be out of the initially visible area
        actions.move_to_element(links[index]).perform()
        links[index].click()
        #scrape the data on the new page and get back with the following command
        driver.execute_script("window.history.go(-1)") #alternatively you can use driver.back()
        WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
        time.sleep(2)
    except StaleElementReferenceException:  
        pass

Solution

  • To perform what you want to do here, you first need to close the cookies banner at the bottom of the page.
    Then you can iterate over the links in the table.
    Since clicking each link opens a new page, after scraping the data there you have to go back to the main page and get the next link. You cannot just collect all the link elements into a list and then iterate over that list, because navigating to another page makes every element Selenium grabbed on the initial page stale. (Collecting the href strings instead avoids this; see the sketch after the code below.)
    Your code can be something like this:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.action_chains import ActionChains
    import time
    
    
    driver = webdriver.Chrome(executable_path=r'C:\Users\my_path\chromedriver_96.exe')
    driver.get(r"https://www.fidelity.co.uk/shares/ftse-350/")
    actions = ActionChains(driver)
    #close the cookies banner
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID, "ensCloseBanner"))).click()
    #wait for the first link in the table
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
    #extra wait to make all the links loaded
    time.sleep(1)
    #get the full list of links
    links = driver.find_elements_by_xpath('//table//tbody/tr/td/a')
    for index, val in enumerate(links):
        #get the links again after getting back to the initial page in the loop
        links = driver.find_elements_by_xpath('//table//tbody/tr/td/a')
        #scroll to the n-th link, it may be out of the initially visible area
        actions.move_to_element(links[index]).perform()
        links[index].click()
        #scrape the data on the new page and get back with the following command
        driver.execute_script("window.history.go(-1)") #alternatively you can use driver.back()
        WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
        time.sleep(1)
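
    If you only need to scrape each factsheet page (rather than interact with the table itself), a possible variant is to collect the href strings up front and open each one with driver.get(): plain strings never go stale, so no re-finding is needed. This is a minimal sketch, assuming the factsheet pages load correctly when opened directly by their URLs:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    import time


    driver = webdriver.Chrome(executable_path=r'C:\Users\my_path\chromedriver_96.exe')
    driver.get(r"https://www.fidelity.co.uk/shares/ftse-350/")
    #close the cookies banner and wait for the table links to load
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID, "ensCloseBanner"))).click()
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table//tbody/tr/td/a")))
    time.sleep(1)
    #collect the href strings once; they stay valid across page navigations
    hrefs = [a.get_attribute('href') for a in driver.find_elements_by_xpath('//table//tbody/tr/td/a')]
    for href in hrefs:
        driver.get(href)
        #scrape the data on the factsheet page here
    driver.quit()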