selenium web-scraping xpath iframe css-selectors

Web-scraping dynamic website with user input using Selenium and Python


As a swimmer, I am trying to pull times from a table that becomes available after the user enters their name or other optional fields. The website generates this data dynamically. Below is my current code, which does not handle the user input.

I am very confused about how Selenium's automation works, how to find the right text field to type into so that my results load, and how the rest of my code can then extract the table.

Can anyone give some advice on how to proceed?

Any help is appreciated and thanks in advance.

This is my current code:

from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd

options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
site = 'https://www.swimming.org.nz/results.html'
# pass the options object in, otherwise the flags set above are never applied
wd = webdriver.Chrome("C:\\Users\\joseph\\webscrape\\chromedriver.exe", options=options)
wd.get(site)
html = wd.page_source
df = pd.read_html(html)
df[1].to_csv('Results.csv') 

Solution

  • To start, you need to send a character sequence to the Swimmer field.

    Because the Swimmer field is inside an iframe, you have to:

    • Use WebDriverWait to wait until the desired frame is available, then switch to it.

    • Use WebDriverWait to wait until the desired element is clickable.

    • You can use either of the following locator strategies:

      • Using CSS_SELECTOR:

        driver.get("https://www.swimming.org.nz/results.html")
        WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe#iframe")))
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[id^='x-MS_FIELD_MEMBER']"))).send_keys("Joseph Zhang")
        
      • Using XPATH:

        driver.get("https://www.swimming.org.nz/results.html")
        WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@id='iframe']")))
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[starts-with(@id, 'x-MS_FIELD_MEMBER')]"))).send_keys("Joseph Zhang")
        
    • Note: You have to add the following imports:

      from selenium.webdriver.support.ui import WebDriverWait
      from selenium.webdriver.common.by import By
      from selenium.webdriver.support import expected_conditions as EC
      
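
Putting the steps above together with the table extraction the question asks about, a minimal sketch might look like the following. Several parts are assumptions rather than anything confirmed for this site: that pressing Enter in the Swimmer field submits the search, that the results show up as a table matched by "table td", and that a Selenium 4 release recent enough to locate chromedriver on its own is installed. The names fetch_results_html and rows_from_html are hypothetical, and the standard-library HTML parser is swapped in for pandas only to keep the sketch free of extra dependencies.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <tr> as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text pieces of the cell being read, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows that had no cells
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def rows_from_html(html):
    """Return all table rows found in `html` as lists of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_results_html(swimmer, timeout=20):
    """Type `swimmer` into the Swimmer field and return the iframe's HTML."""
    # Imported inside the function only so rows_from_html can be used without
    # Selenium installed; in a real script these belong at the top of the file.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://www.swimming.org.nz/results.html")
        wait = WebDriverWait(driver, timeout)
        # The form lives inside an iframe, so switch to it first.
        wait.until(EC.frame_to_be_available_and_switch_to_it(
            (By.CSS_SELECTOR, "iframe#iframe")))
        field = wait.until(EC.element_to_be_clickable(
            (By.CSS_SELECTOR, "input[id^='x-MS_FIELD_MEMBER']")))
        field.send_keys(swimmer)
        # ASSUMPTION: Enter submits the search; a button click may be needed instead.
        field.send_keys(Keys.ENTER)
        # ASSUMPTION: results render as a <table>; adjust this locator to the real page.
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "table td")))
        # After the frame switch, page_source refers to the iframe's document.
        return driver.page_source
    finally:
        driver.quit()
```

Calling rows_from_html(fetch_results_html("Joseph Zhang")) would then give a list of rows that can be written out with csv.writer. The question's pd.read_html could be applied to the returned HTML instead of the standard-library parser.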

