I'm currently trying to do some Selenium webscraping but I keep running into this error:
StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
The code is supposed to open http://www.grownjkids.gov/ParentsFamilies/ProviderSearch, repeatedly click the next button (">") in the results, and scrape the results from every page in a loop. It works for a few pages but then fails sporadically, on a random page, with the above exception.
I've already looked at numerous Stack Overflow posts with similar concerns and tried some of the proposed fixes: using the WebDriverWait class to implement an explicit wait, using try/except blocks to loop and re-find the element with driver.find_element... when a StaleElementReferenceException occurs, and trying both driver.find_element_by_id and driver.find_element_by_xpath.
Below is my code:
url = "http://www.grownjkids.gov/ParentsFamilies/ProviderSearch"
driver = webdriver.Chrome('MY WEBDRIVER FILE PATH')
driver.implicitly_wait(10)
driver.get(url)

# clears text box
driver.find_element_by_class_name("form-control").clear()

# clicks on search button without putting in any parameters, getting all the results
search_button = driver.find_element_by_id("searchButton")
search_button.click()

# function to find next button
def find(driver):
    try:
        element = driver.find_element_by_class_name("next")
        if element:
            return element
    except StaleElementReferenceException:
        while (attempts < 100):
            element = driver.find_element_by_class_name("next")
            if element:
                return element
            attempts += 1

# keeps on clicking next button to fetch each group of 5 results
while True:
    try:
        nextButton = WebDriverWait(driver, 2000).until(find)
    except NoSuchElementException:
        break
    nextButton.send_keys('\n')
    table = driver.find_element_by_id("results")
    html_source = table.get_attribute('innerHTML')
    print html_source
I have a hunch that increasing the WebDriverWait timeout to 2000 and looping for 100 attempts isn't really having any effect (perhaps it's not going into that block?), because the results are the same regardless of how much I increase them. Any feedback on my code is appreciated as well, since this is my first time using Selenium and I'm fairly new to Python.
A StaleElementReferenceException occurs when WebDriver tries to perform an action on an element that no longer exists in the page or is no longer valid.
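To illustrate: the error generally surfaces when a stored element reference is used (for example on a click) after the page has re-rendered, so a common remedy is to look the element up again on every try. The following is only a minimal sketch of that pattern, not code from the question. click_with_retry, locate, and FakeButton are hypothetical names, and the fallback exception class exists only so the sketch runs without Selenium installed.

```python
try:
    # Real exception class when Selenium is installed.
    from selenium.common.exceptions import StaleElementReferenceException
except ImportError:
    # Stand-in so the sketch runs without Selenium (assumption for illustration).
    class StaleElementReferenceException(Exception):
        pass


def click_with_retry(locate, attempts=3):
    """Call locate() for a fresh element and click it, retrying if it went stale.

    `locate` is a hypothetical zero-argument callable, for example
    lambda: driver.find_element(By.CLASS_NAME, "next").
    Returns the number of tries that were needed.
    """
    for attempt in range(1, attempts + 1):
        try:
            element = locate()   # look the element up again on every try
            element.click()      # a stale reference fails here, on use
            return attempt
        except StaleElementReferenceException:
            if attempt == attempts:
                raise            # still stale after the last try: give up


class FakeButton:
    """Test double: an element that is stale for its first N clicks."""

    def __init__(self, stale_clicks):
        self.stale_clicks = stale_clicks
        self.clicked = False

    def click(self):
        if self.stale_clicks > 0:
            self.stale_clicks -= 1
            raise StaleElementReferenceException(
                "element is not attached to the page document")
        self.clicked = True


button = FakeButton(stale_clicks=1)
tries_used = click_with_retry(lambda: button)
print(tries_used, button.clicked)  # prints: 2 True
```

With a real driver, the lambda would perform the lookup itself, so every retry works with a freshly located element rather than a reference saved before the page changed.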
I have added a fluent wait to your code that waits for the element to become clickable. Try the code below:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import StaleElementReferenceException, WebDriverException, NoSuchElementException, TimeoutException
from selenium.webdriver.common.by import By

# raw string so the backslashes in the Windows path are not treated as escapes
driver = webdriver.Chrome(r'C:\NotBackedUp\chromedriver.exe')
url = "http://www.grownjkids.gov/ParentsFamilies/ProviderSearch"
driver.get(url)

# clears text box
driver.find_element_by_class_name("form-control").clear()

# clicks on search button without putting in any parameters, getting all the results
search_button = driver.find_element_by_id("searchButton")
search_button.click()

# keeps on clicking next button to fetch each group of 5 results
i = 1
while True:
    # fluent wait: poll every second, ignoring these exceptions while polling
    wait = WebDriverWait(driver, timeout=1000, poll_frequency=1, ignored_exceptions=[StaleElementReferenceException, WebDriverException])
    try:
        element = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, 'next')))
        element.click()
        print("Clicked ===> ", i)
        i += 1
    except (NoSuchElementException, TimeoutException):
        # wait.until raises TimeoutException if the button never becomes clickable
        break
    table = driver.find_element_by_id("results")
    html_source = table.get_attribute('innerHTML')
    print(html_source)
The fluent wait keeps trying to locate the next button, ignoring any StaleElementReferenceException or WebDriverException raised while polling.
The loop breaks on NoSuchElementException, or on TimeoutException, which WebDriverWait raises when the button does not become clickable within the timeout.
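To make the polling behaviour concrete, here is a simplified pure-Python model of a wait that ignores chosen exceptions. It is a sketch of the idea only, not Selenium's implementation. poll_until, WaitTimedOut, and flaky_condition are hypothetical names, and WaitTimedOut stands in for the TimeoutException that WebDriverWait.until raises when time runs out.

```python
import time


class WaitTimedOut(Exception):
    """Hypothetical stand-in for Selenium's TimeoutException."""


def poll_until(condition, timeout, poll_frequency=0.01, ignored_exceptions=()):
    """Call condition() until it returns a truthy value or `timeout` seconds pass.

    Exceptions listed in `ignored_exceptions` are swallowed and treated as
    "not ready yet"; any other exception propagates to the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition()
            if value:
                return value
        except tuple(ignored_exceptions):
            pass  # e.g. a stale element: try again on the next poll
        if time.monotonic() > deadline:
            raise WaitTimedOut("condition not met within %s s" % timeout)
        time.sleep(poll_frequency)


calls = {"n": 0}


def flaky_condition():
    """Fails twice with an ignored error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise LookupError("element went stale")
    return "next-button"


result = poll_until(flaky_condition, timeout=1, ignored_exceptions=[LookupError])
print(result, calls["n"])  # prints: next-button 3

try:
    poll_until(lambda: False, timeout=0.05)
    timed_out = False
except WaitTimedOut:
    timed_out = True
print(timed_out)  # prints: True
```

The second call shows why a very large timeout matters on the last page: if the condition never becomes true, the wait only gives up once the full timeout has elapsed.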
I hope it helps...