
Selenium is not finding the iframe element


This is the website I want to scrape: https://anime-hayai.com/play/30148/%E0%B8%95%E0%B8%AD%E0%B8%99%E0%B8%97%E0%B8%B5%E0%B9%88-1-hd.html

I want to scrape the src='....' value from the video element, but my code returns an empty string.

What I have tried:

from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://anime-hayai.com/play/30148/%E0%B8%95%E0%B8%AD%E0%B8%99%E0%B8%97%E0%B8%B5%E0%B9%88-1-hd.html')

video = driver.find_element_by_xpath("//*[@id='player']/div[2]/div[4]/video").text
print(video)

It returns an empty string ('').

Am I doing something wrong? This is the result I expect from the video src:

'https://stream.anime-hayai.com/videoplayback?id=6o7ov8-mxpu0aKZQYtLDpMuepaDas4tllm2jqqGUcqDE1c-j0pzahY1pxGuiaJhSteDG5NLdkaaVcotklmRye55ecaGWoJqio46hcs2XyKSmaKZQYqCXoovq'

Solution

  • The video sits inside nested iframes, so you first have to

    1. Switch to the first iframe
    2. Switch to the child iframe

    You also need to scroll all the way down the page.

    Code:

    # Downloads a chromedriver matching the installed Chrome, if needed
    chromedriver_autoinstaller.install()
    # Local path to chromedriver.exe; adjust for your machine
    driver_path = r'C:\Users\panabh02\OneDrive - CSG Systems Inc\Desktop\Automation\chromedriver.exe'

    driver = webdriver.Chrome(driver_path)
    driver.maximize_window()

    wait = WebDriverWait(driver, 30)

    driver.get("https://anime-hayai.com/play/30148/%E0%B8%95%E0%B8%AD%E0%B8%99%E0%B8%97%E0%B8%B5%E0%B9%88-1-hd.html")
    # Scroll to the bottom of the page
    driver.execute_script("var scrollingElement = (document.scrollingElement || document.body);scrollingElement.scrollTop = scrollingElement.scrollHeight;")
    # Switch to the outer iframe, then to the child iframe inside it
    wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[name='video_player']")))
    wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[class='embed-responsive-item']")))
    # The URL is in the src attribute, not in the element's text
    video_url = wait.until(EC.visibility_of_element_located((By.XPATH, "//div[@class='jw-media jw-reset']//*"))).get_attribute('src')
    print(video_url)
    

    Imports:

    import chromedriver_autoinstaller
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

    Output:

    https://stream.anime-hayai.com/videoplayback?id=6o7ov8-mxpu0aKZQYtLDpMuepaDas4tllm2jqqGUcqDE1c-j0pzahY1pxGuiaJhSteDG5NLdkaaVcotklmRye55ecaGWoJqio46hcs2XyKSmaKZQYqCXoovq
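
The two switching steps above can also be wrapped in a small reusable helper. This is only a sketch, not part of the original answer: the names `switch_into_frames`, `read_video_src`, `FRAME_PATH` and `VIDEO` are made up for illustration, and the selectors are simply copied from the code above. It uses plain `find_element` calls with no waiting, so on the real page you would most likely still want the `WebDriverWait` conditions shown in the answer.

```python
# Locator strategies as plain strings. They equal selenium's By.CSS_SELECTOR
# and By.XPATH, so this sketch itself needs no selenium import.
CSS = "css selector"
XPATH = "xpath"

# Frame selectors copied from the answer above, outermost first.
FRAME_PATH = [
    (CSS, "iframe[name='video_player']"),
    (CSS, "iframe[class='embed-responsive-item']"),
]
VIDEO = (XPATH, "//div[@class='jw-media jw-reset']//*")


def switch_into_frames(driver, frame_path):
    """Hypothetical helper: enter each nested iframe in order."""
    driver.switch_to.default_content()  # always start from the top-level page
    for by, value in frame_path:
        frame = driver.find_element(by, value)
        driver.switch_to.frame(frame)  # later lookups now run inside this frame


def read_video_src(driver, frame_path=FRAME_PATH, video=VIDEO):
    """Return the src attribute of the matched element, then reset the driver."""
    switch_into_frames(driver, frame_path)
    try:
        by, value = video
        # The URL lives in the src attribute; .text is empty for this element
        return driver.find_element(by, value).get_attribute("src")
    finally:
        # Leave the driver on the top-level page for whatever runs next
        driver.switch_to.default_content()
```

With the `driver` from the answer, after `driver.get(...)` and the scroll, calling `read_video_src(driver)` should return the same URL, assuming both iframes have already loaded by then.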