python-3.x selenium-webdriver xpath webdriverwait dopostback

How to paginate through the page numbers when href contains javascript:__doPostBack()


I'm trying to scrape this website: http://www.mfa.gov.tr/sub.ar.mfa?dcabec54-44b3-4aaa-a725-70d0caa8a0ae. I can't move to the next page because the URL never changes. The page links look like this:

href="javascript:__doPostBack('sb$grd','Page$1')"
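
For context, the two arguments in that href are what ASP.NET Web Forms calls the event target and the event argument. Here is a minimal sketch of pulling them out of an href string, assuming every pager link follows the pattern shown above. The helper names `parse_postback` and `page_number` are hypothetical, not part of Selenium or the site.

```python
import re

# Matches __doPostBack('<target>','<argument>') inside an href string.
_POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)',\s*'([^']*)'\)")


def parse_postback(href):
    """Return (event_target, event_argument), or None if href is not a postback link."""
    match = _POSTBACK_RE.search(href)
    if match is None:
        return None
    return match.group(1), match.group(2)


def page_number(href):
    """Return the page number encoded as 'Page$<n>', or None if there is none."""
    parsed = parse_postback(href)
    if parsed is None or not parsed[1].startswith("Page$"):
        return None
    suffix = parsed[1][len("Page$"):]
    return int(suffix) if suffix.isdigit() else None


print(parse_postback("javascript:__doPostBack('sb$grd','Page$1')"))
```

Because the page number lives in the argument rather than in the URL, the address bar stays the same while the grid content changes.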

I tried the code below, but it only reaches page 2 and then raises an error: stale element reference: element is not attached to the page document

from selenium import webdriver
url = 'http://www.mfa.gov.tr/sub.ar.mfa?dcabec54-44b3-4aaa-a725-70d0caa8a0ae'
driver = webdriver.Chrome()
driver.get(url)
btn = [w for w in driver.find_elements_by_xpath('//*[@id="sb_grd"]/tbody/tr[26]/td/table/tbody/tr/td/a')]
for b in btn:
    driver.execute_script("arguments[0].click();", b)

Solution

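  • To paginate through page numbers whose href attribute is "javascript:__doPostBack('sb$grd','Page$2')", re-locate the link on every iteration instead of reusing elements collected before the first click. Each postback re-renders the grid, so references found earlier are detached from the page, which is what the stale element reference error reports. Use WebDriverWait with element_to_be_clickable() and the following locator strategy: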

    • Code Block:

      from selenium import webdriver
      from selenium.webdriver.common.by import By
      from selenium.webdriver.support.ui import WebDriverWait
      from selenium.webdriver.support import expected_conditions as EC
      from selenium.common.exceptions import TimeoutException
      
      options = webdriver.ChromeOptions() 
      options.add_argument("start-maximized")
      options.add_experimental_option("excludeSwitches", ["enable-automation"])
      options.add_experimental_option('useAutomationExtension', False)
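      # Selenium 3 style; Selenium 4 expects the driver path to be passed through a Service object instead
      driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')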
      driver.get("http://www.mfa.gov.tr/sub.ar.mfa?dcabec54-44b3-4aaa-a725-70d0caa8a0ae")
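      while True:
          try:
              # Look the link up again on every pass: the XPath starts from the <span> in the pager
              # (presumably the current page) and takes the link in the first <td> that follows it
              WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//table[@id='sb_grd']//table/tbody/tr//td/span//following::td[1]/a"))).click()
              print("Next page clicked")
          except TimeoutException:
              # No clickable link after the current page within 20 seconds: treat it as the last page
              print("No more pages")
              break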
      driver.quit()
      
    • Console Output:

      Next page clicked
      Next page clicked
      Next page clicked
      .
      .
      .
      No more pages
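
The loop above can also be written so that it waits for the old link to go stale before looking for the next one, which avoids clicking a link that belongs to the page being replaced. What follows is only a sketch under assumptions: `paginate` and `selenium_hooks` are hypothetical helper names, the XPath is the one from the code block above, and the staleness wait assumes the postback really does rebuild the pager. The Selenium imports sit inside `selenium_hooks` so that the loop itself can be exercised without a browser.

```python
def paginate(find_next, wait_until_replaced, max_pages=1000):
    """Click the 'next page' link until find_next() returns None.

    The link is looked up fresh on every pass, so no reference
    survives a postback. Returns the number of clicks made.
    """
    clicked = 0
    while clicked < max_pages:
        link = find_next()
        if link is None:
            break
        link.click()
        # Block until the old link is gone, i.e. the page was re-rendered.
        wait_until_replaced(link)
        clicked += 1
    return clicked


def selenium_hooks(driver, timeout=20):
    """Build the two callables for a live Selenium session (assumed setup)."""
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    next_xpath = "//table[@id='sb_grd']//table/tbody/tr//td/span//following::td[1]/a"

    def find_next():
        try:
            return WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, next_xpath))
            )
        except TimeoutException:
            return None  # nothing clickable after the current page

    def wait_until_replaced(old_link):
        WebDriverWait(driver, timeout).until(EC.staleness_of(old_link))

    return find_next, wait_until_replaced
```

With a driver already on the page, `paginate(*selenium_hooks(driver))` would run the whole loop. The `max_pages` cap is just a guard against looping forever if the pager never runs out of links.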
      

    Reference

    You can find a relevant detailed discussion in: