
Parsing a dynamically loaded webpage with Selenium


I'm trying to parse https://www.flashscore.com/football/albania/ using Selenium in Python, but my webdriver often doesn't wait for the scores to finish loading.

Here's the code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Firefox()
driver.get("https://www.flashscore.com/football/albania/")
try:
    WebDriverWait(driver, 100).until(
            lambda s: s.execute_script("return jQuery.active == 0"))
    print(driver.page_source)
finally:
    driver.quit()

Occasionally, this will print out source code for a flashscore page with a blank table (i.e. the driver does not wait for the scores to finish loading). I suspect that this is because some of the live scores on the page are dynamically loaded. Is there any way to improve my wait condition?


Solution

    1. The page shows an "accept cookies" button, so click that first.
    2. Use explicit waits: first wait for the table to be present, then for its main body to be visible.

    Code:

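    driver = webdriver.Firefox()
    driver.maximize_window()
    # Explicit wait only: the Selenium docs warn that mixing implicit and
    # explicit waits can cause unpredictable wait times.
    wait = WebDriverWait(driver, 30)
    try:
        driver.get("https://www.flashscore.com/football/albania/")
        # Dismiss the cookie consent banner first.
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#onetrust-accept-btn-handler"))).click()
        # Wait for the table container to exist, then for its main body to be visible.
        wait.until(EC.presence_of_element_located((By.ID, "live-table")))
        wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "section.event")))
        print(driver.page_source)
    finally:
        driver.quit()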
    

    Imports:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
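
If the table is visible but its rows are still arriving, a custom wait condition may help. `WebDriverWait.until()` accepts any callable that takes the driver and returns a truthy value, so the sketch below waits until some rows exist and the row count stops changing between two polls. The class name `RowCountStable` and the selector `div.event__match` are assumptions for illustration, not part of the answer above; check the selector against the live page before relying on it.

```python
# Sketch of a custom wait condition for WebDriverWait.until().
# ROW_SELECTOR is an ASSUMPTION about the page markup; adjust it after
# inspecting the page in the browser's developer tools.
ROW_SELECTOR = "div.event__match"  # hypothetical selector for one match row


class RowCountStable:
    """Truthy once at least `min_rows` rows exist and the row count is
    the same as on the previous poll."""

    def __init__(self, selector=ROW_SELECTOR, min_rows=1):
        self.selector = selector
        self.min_rows = min_rows
        self._last_count = None  # row count seen on the previous poll

    def __call__(self, driver):
        # "css selector" is the string value of By.CSS_SELECTOR.
        rows = driver.find_elements("css selector", self.selector)
        count = len(rows)
        stable = count >= self.min_rows and count == self._last_count
        self._last_count = count
        # Return the rows on success so until() hands them back to the caller.
        return rows if stable else False
```

It would be used as `rows = wait.until(RowCountStable())` after the visibility wait. A stable count over one polling interval (0.5 s by default) is only a heuristic: live scores can still change later, so this guards against an empty table rather than guaranteeing a final state.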
    

    The output is too long to post here; Stack Overflow won't allow it.
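
Since the full page source is long, one option is to print only the table's markup instead of `driver.page_source`. The helper below is a minimal sketch; the name `live_table_html` is hypothetical, and it assumes the table keeps the `live-table` id used in the waits above.

```python
def live_table_html(driver, table_id="live-table"):
    """Return the outer HTML of the results table only."""
    # "id" is the string value of By.ID; passing By.ID works the same way.
    table = driver.find_element("id", table_id)
    return table.get_attribute("outerHTML")
```

Calling `print(live_table_html(driver))` once the waits have passed keeps the output limited to the part of the page that holds the scores.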