python, selenium-webdriver, xpath

XPath - how to select div that has a specific class, but only if it has a child div with a specific class inside iframe?


I'm scraping this page with Python and Selenium. Specifically, I'm trying to scrape all the job search results (divs with job information), and as you can see, they're in a div element with a class row. However, because they don't have a class specific for them alone, but just have a generic class row, I can't just get them by that alone.

This is what I tried, in an attempt to get an element with a class row, which has a child with a class header. But I think I'm using the contains wrong, and I have no idea how to fix it:

wait.until(EC.visibility_of_element_located((By.XPATH, "(//div[contains(@class,'row') and (contains(@class, 'header'))])")))

When I use the code above, I get the TimeoutException error, so I'm assuming it's looking for a div with both of those classes and failing to find it. Which is not what I had in mind.

How do I adjust the contains above to get what I need? Is it even possible to use this approach? Thanks!


Solution

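  • Your table is inside an iframe.

    To reach an element inside an iframe, you first have to switch to its context. Until you do, lookups from the top-level document cannot see the rows, which is why the wait times out.

    The frame has a unique id, icims_content_iframe.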

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    options = Options()
    driver = webdriver.Chrome(options=options)
    driver.maximize_window()
    wait = WebDriverWait(driver, 10)
    
    driver.get("https://bbrauncareers-bbraun.icims.com/jobs/search?ss=1&searchRelation=keyword_all&mobile=false&width=1168&height=500&bga=true&needsRedirect=false&jan1offset=120&jun1offset=180")
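    # Wait for the iframe that holds the results, then switch into its context.
    frame = wait.until(EC.presence_of_element_located((By.ID, 'icims_content_iframe')))
    driver.switch_to.frame(frame)
    
    # Collect every .row element inside the jobs table container.
    table_rows = wait.until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR,"[class*=JobsTable] .row"))
    )
    
    # Print the job title found in each row.
    for row in table_rows:
        print(row.find_element(By.CSS_SELECTOR, '.title h2').text)
    
    # Return to the top-level document when done with the iframe.
    driver.switch_to.default_content()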
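
As for the XPath itself: putting both contains() calls in one predicate joined by "and" tests the same div for both classes. To test a child instead, use a second predicate that starts from the outer div. Below is a minimal sketch of how such an expression could be built. The helper names has_class and div_with_child are made up for illustration, and it assumes the header element is a div somewhere under the row, which I have not checked against the live page.

```python
def has_class(name: str) -> str:
    """XPath predicate that matches one whole class token, not a substring."""
    # Padding with spaces keeps 'row' from also matching e.g. 'rows' or 'arrow'.
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"


def div_with_child(parent_class: str, child_class: str) -> str:
    """XPath for a div with parent_class that contains a div with child_class."""
    # The second predicate starts with a dot, so it is evaluated relative to
    # the outer div and tests its descendants rather than the outer div itself.
    return f"//div[{has_class(parent_class)}][.//div[{has_class(child_class)}]]"


# Class names taken from the question; the page structure is an assumption.
ROW_WITH_HEADER = div_with_child("row", "header")
print(ROW_WITH_HEADER)
```

After switching into the iframe, the resulting string can be passed to the same wait as above, e.g. wait.until(EC.presence_of_all_elements_located((By.XPATH, ROW_WITH_HEADER))). If the header has to be a direct child, change .//div to ./div. The shorter form //div[contains(@class,'row')][.//div[contains(@class,'header')]] works too, but it also matches class names that merely contain those substrings.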