python selenium-webdriver web-scraping selenium-chromedriver

Why can't Selenium find this web element?


I can't get Selenium in Python to find the "Allow all" button in the cookie pop-up on this website:

https://www.dice.com/jobs

It can't find the login button either.

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get("https://www.dice.com/jobs")
WebDriverWait(driver, timeout=10).until(EC.presence_of_element_located((By.ID, "cmpwelcomebtnyes")))
driver.find_element(By.ID, "cmpwelcomebtnyes").click()

This just raises a TimeoutException.


Solution

  • The problem is that the element you are looking for is inside a shadow root (shadow DOM). A normal find_element call on the driver does not see into a shadow root, so you have to step into it explicitly, similar to switching into an IFRAME if you are familiar with that. When you inspect the page, it looks like this:

    <body>
      <div id="cmpwrapper" class="cmpwrapper">
        #shadow-root (open)
          <div id="cmpbox" ...>
    

    We need to locate the shadow host, which is the element that contains the shadow root (here, #cmpwrapper), get its shadow_root, and then search for the button from there.

    Working code is below:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.wait import WebDriverWait
    
    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get('https://www.dice.com/jobs')
    
    wait = WebDriverWait(driver, 10)
    shadow_host = wait.until(EC.presence_of_element_located((By.ID, 'cmpwrapper')))
    shadow_root = shadow_host.shadow_root
    shadow_root.find_element(By.ID, 'cmpwelcomebtnyes').click()
    

    A couple of additional notes:

    1. As of Selenium 4.6, you no longer need a separate driver manager such as ChromeDriverManager. Selenium now has a built-in Selenium Manager that takes care of the driver for you. The code above reflects this change.

    2. When you use a WebDriverWait that waits for an element, the .until() method returns that element. You can use the return value directly or assign it to a variable, so you don't have to look up the element a second time. See the example below:

      WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "id")))
      driver.find_element(By.ID, "id").click()
      

      can be simplified to

      WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "id"))).click()
      

      or for variable assignment,

      e = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "id")))
      e.click()