
Unable to click Next button using selenium as number of pages are unknown


I am new to Selenium and am trying to scrape:

https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/

I need all the details mentioned on this page, and on the other pages as well.

There are more pages containing the same kind of information, and I need to scrape them too. I tried to reach them by changing the target URL:

https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/40

but the trailing number changes and does not match the page number. Page 3 has 40 at the end, while page 5 is:

https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/80

so I am not able to get the data that way.
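For what it's worth, the trailing number looks like a zero-based result offset rather than a page number: with 20 results per page, page 3 starts at result 40 and page 5 at result 80. If that pattern holds (an assumption based only on the two URLs above), the page URLs could be built like this:

```python
BASE = "https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/"

def page_url(page, per_page=20):
    """Build the URL for a 1-based page number.

    Assumes the trailing number is a result offset of 20 per page,
    inferred from page 3 -> /40 and page 5 -> /80; this is a guess,
    not documented behaviour of the site.
    """
    offset = (page - 1) * per_page
    return BASE if offset == 0 else f"{BASE}{offset}"
```

Note that this only works for as many pages as actually exist, so it does not by itself solve the "unknown number of pages" problem.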

Here is my code:-

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/")
dist = []

def extract_url():
    # Collect the detail-page links from the current result page
    url = driver.find_elements(By.XPATH, "//h2[@class='resultTitle']//a")
    for i in url:
        dist.append(i.get_attribute("href"))

    driver.execute_script("window.scrollTo(0,document.body.scrollHeight)")

    # Click the Next button
    driver.find_element(By.XPATH, "//li[@class='btnNextPre']//a").click()

for _ in range(10):
    extract_url()

This works fine till page 5 but not after that. Could you please suggest how I can iterate over the pages when the number of pages is unknown, and extract data till the last page?


Solution

  • You need to check whether the pagination link is disabled. Use an infinite loop and break when the Next button becomes disabled.

    Use WebDriverWait() to wait for the result elements to become visible before reading them.

    Code:

    driver.get("https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/")
    counter = 1
    while True:
        # Wait until the result links on the current page are visible
        WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h2.resultTitle > a")))
        urllist = [item.get_attribute('href') for item in driver.find_elements(By.CSS_SELECTOR, "h2.resultTitle > a")]
        print(urllist)
        print("Page number: " + str(counter))
        # Check for the disabled Next button BEFORE clicking, otherwise the
        # click on the last page fails and the loop never reaches the check
        if len(driver.find_elements(By.XPATH, "//li[@class='disabled']//a[text()='>']")) > 0:
            print("Next button disabled, last page reached")
            break
        driver.execute_script("arguments[0].click();", driver.find_element(By.CSS_SELECTOR, "ul.pagination > li.btnNextPre > a"))
        time.sleep(2)  # to slow down the loop
        counter = counter + 1
    

    Import the libraries below.

    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    import time
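    Note that urllist is rebuilt on every iteration of the loop, so if you want all links in one place you have to accumulate them yourself. A minimal sketch of a helper for that (the merge_unique name is my own, not part of the answer above): collect each page's urllist into a pages list inside the loop, then flatten it once the loop breaks, dropping duplicates while preserving order.

```python
def merge_unique(pages):
    """Flatten per-page URL lists into one list, keeping first occurrences only."""
    seen = set()
    merged = []
    for urls in pages:
        for url in urls:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

    Inside the while loop you would do pages.append(urllist), and call merge_unique(pages) after the break.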