Tags: python, selenium, selenium-webdriver, selenium-chromedriver, python-webbrowser

Python Selenium click load more on table


I am trying to scrape all of the data in this table. However, the last row is a "Load More" table row that I do not know how to trigger. So far I have tried the following approaches, none of which worked:

  1. I tried to click on the row itself by this:
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get(url)
soup = BeautifulSoup(driver.page_source, 'html.parser')

table = soup.find('table', {"class": "competition-leaderboard__table"})

i = 0
for team in table.find_all('tbody'):
    rows = team.find_all('tr')
    for row in rows:
        i = i + 1
        if i == 51:
            row.click()

        # the scraping code for the first 50 elements
        

The code above throws an error saying that "'NoneType' object is not callable".
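That message is what BeautifulSoup produces when `click()` is called on one of its tags: attribute access on a `Tag` is treated as a search for a child tag of that name, so `row.click` evaluates to `None`. The sketch below reproduces this on a made-up HTML fragment (the fragment is an assumption standing in for the real leaderboard table, and it assumes `bs4` is installed):

```python
from bs4 import BeautifulSoup

# Made-up fragment standing in for the leaderboard table (assumption)
html = "<table><tbody><tr><td>Load More</td></tr></tbody></table>"
row = BeautifulSoup(html, "html.parser").find("tr")

# Attribute access on a Tag looks for a child tag with that name,
# so row.click is the same as row.find("click"): there is no <click> tag.
print(row.click)  # None

try:
    row.click()  # calling None raises the TypeError from the question
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # 'NoneType' object is not callable
```

Parsed BeautifulSoup objects are a static copy of the page source, so clicking has to be done on an element obtained from the Selenium driver instead.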

  2. I also tried to get the "Load More" table row by its class and click on it:

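from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get(url)

# Locate the "Load More" wrapper by its class name and click it
load_more = driver.find_element(By.CLASS_NAME, 'competition-leaderboard__load-more-wrapper')
load_more.click()

soup = BeautifulSoup(driver.page_source, 'html.parser')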

The code above also did not work.

So my question is: how can I make Python click on the "Load More" table row? In the HTML structure of the site, "Load More" does not appear to be a clickable button.


Solution

  • In your code you have to accept the cookies first; only then can you click the 'Load more' button.

    CSS selectors are the most suitable in this case.

    import time
    
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    # Selenium 4 takes the chromedriver path through a Service object
    driver = webdriver.Chrome(service=Service('/snap/bin/chromium.chromedriver'))
    driver.implicitly_wait(10)
    driver.get('https://www.kaggle.com/c/coleridgeinitiative-show-us-the-data/leaderboard')
    
    wait = WebDriverWait(driver, 30)
    # Wait until the cookie button is clickable, then accept the cookies
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".sc-pAyMl.dwWbEz .sc-AxiKw.kOAUSS>.sc-AxhCb.gsXzyw"))).click()
    # Click the 'Load more' row
    driver.find_element(By.CSS_SELECTOR, ".competition-leaderboard__load-more-count").click()
    time.sleep(10)  # Added for you to make sure that both buttons were clicked
    driver.quit()  # quit() closes all windows and ends the session
    

    I tested this snippet and it clicked the desired button. Note that I've added WebDriverWait in order to wait until the first button is clickable.

    UPDATE: I added time.sleep(10) so you could see that both buttons are clicked.
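
To get the whole table rather than one extra batch, the click probably has to be repeated. Below is a minimal sketch of that idea, not a tested scraper. It assumes the 'Load more' element disappears from the page once every row has been loaded, and that a short fixed pause is enough for each batch to render. `click_until_gone` and `scrape_leaderboard` are hypothetical helper names, and the `max_clicks` cap and `pause` value are arbitrary.

```python
import time

CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR
LOAD_MORE = ".competition-leaderboard__load-more-count"  # selector from the answer


def click_until_gone(driver, selector=LOAD_MORE, max_clicks=100, pause=1.0):
    """Click the element matching `selector` until it no longer appears.

    Returns the number of clicks performed. `max_clicks` is a safety cap
    so the loop ends even if the element never goes away.
    """
    clicks = 0
    while clicks < max_clicks:
        matches = driver.find_elements(CSS, selector)
        if not matches:
            break  # assumption: no element left means all rows are loaded
        matches[0].click()
        clicks += 1
        time.sleep(pause)  # crude wait for the next batch of rows to render
    return clicks


def scrape_leaderboard(url):
    """Hypothetical usage: load every row, then parse the rendered page."""
    # Imported here so click_until_gone can be tried without a browser
    from bs4 import BeautifulSoup
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Accept the cookies here first, as in the snippet above
        click_until_gone(driver)
        return BeautifulSoup(driver.page_source, "html.parser")
    finally:
        driver.quit()
```

`find_elements` returns an empty list instead of raising when nothing matches, which is what lets the loop stop cleanly. The parsed result can then be searched with `soup.find('table', {"class": "competition-leaderboard__table"})` as in the question.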