python html css selenium-webdriver web-scraping

Scraping data after clicking a checkbox in Python


I am trying to scrape some links from this career website. The problem is that before scraping the links, I need to select a particular brand (say, Sierra). How do I click on dropdowns and checkboxes to select a brand?

I tried the following:

Step 1: Click on "Brand" first to reveal the brand checkboxes. (In a normal browser, the checkboxes become available once "Brand" is clicked.)

Step 2: Select the brand by clicking its checkbox. However, I can't find the checkbox using CSS or XPath.

Step 3: Once the checkbox is selected, links to many job postings are listed, but only the first 10 are shown on each page. I need to navigate through the pages and collect the links to all job postings (11-20, 21-30, etc.).

Code for step 1: I tried the following code to click on "Brand", but I am not sure whether it actually does so (I don't know how to verify that it works).

brand_dropdown_button = driver.find_element(By.XPATH, "//button[contains(text(), 'Brand')]")
brand_dropdown_button.click()

Code for step 2 (I tried the following, which does not work):

checkbox = driver.find_element(By.XPATH, "//input[@value='Sierra']") # no such element
checkbox = driver.find_element(By.CSS_SELECTOR, "input[value='Sierra']") # no such element

I tried giving it time, but even after waiting for 10 seconds, my code can't find any checkboxes.

Another problem is that the URL does not change when we click "Brand" to reveal the checkboxes, or when we select a particular brand. That is why I can't verify anything manually.


Solution

  • Here's some code that clicks the Brand accordion and then clicks the brand whose name is given in the brand variable.

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    url = "https://jobs.tjx.com/global/en/search-results?rk=l-retail-jobs"
    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get(url)
    
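    brand = "HomeGoods"
    wait = WebDriverWait(driver, 10)
    # Expand the Brand accordion so the list of brands is rendered.
    wait.until(EC.element_to_be_clickable((By.ID, "BrandAccordion"))).click()
    # Click the visible SPAN that carries the brand name.
    wait.until(EC.element_to_be_clickable((By.XPATH, f"//div[@id='BrandBody']//span[text()='{brand}']"))).click()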
    

    The INPUT that you were trying to click is just a "backing" element; it's not actually visible or clickable. I had to resort to clicking the SPAN next to it, which contains the Brand name.
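
Step 3 of the question (collecting the links from every results page, 11-20, 21-30, and so on) can be sketched roughly as follows. This is only an outline under stated assumptions: the function names collect_all_links and selenium_callbacks are made up for illustration, and the two CSS selectors passed in (one for the job links, one for the "next page" control) are hypothetical placeholders that have to be read off the page in the browser's developer tools. Because the URL does not change between pages, the sketch waits for an old result element to go stale after clicking "next" instead of watching the URL; that only works if the site actually replaces the result elements when it paginates.

```python
from typing import Callable, Iterable, List


def collect_all_links(
    get_page_links: Callable[[], Iterable[str]],
    go_to_next_page: Callable[[], bool],
    max_pages: int = 100,
) -> List[str]:
    """Gather links from every results page, keeping order and skipping duplicates."""
    seen = set()
    links = []
    for _ in range(max_pages):  # hard cap so a broken "next" check cannot loop forever
        for href in get_page_links():
            if href and href not in seen:
                seen.add(href)
                links.append(href)
        if not go_to_next_page():
            break
    return links


def selenium_callbacks(driver, link_css, next_css, timeout=10):
    """Build the two callbacks for collect_all_links from a Selenium driver.

    link_css and next_css are hypothetical selectors supplied by the caller.
    """
    # Imported lazily so collect_all_links stays usable without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.wait import WebDriverWait

    wait = WebDriverWait(driver, timeout)

    def get_page_links():
        anchors = wait.until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, link_css))
        )
        return [a.get_attribute("href") for a in anchors]

    def go_to_next_page():
        old = driver.find_elements(By.CSS_SELECTOR, link_css)
        try:
            wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, next_css))).click()
            if old:
                # Assumption: the old results are removed from the DOM on page change.
                wait.until(EC.staleness_of(old[0]))
        except TimeoutException:
            return False  # no usable "next" control, or the results never changed
        return True

    return get_page_links, go_to_next_page
```

With the driver from the code above, and after the brand has been clicked, the usage would look something like get_links, next_page = selenium_callbacks(driver, "<job link selector>", "<next button selector>") followed by all_links = collect_all_links(get_links, next_page). Keeping the loop separate from the Selenium calls makes it possible to try the paging logic against fake pages before pointing it at the live site.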