
In Playwright for Python, how do I retrieve a handle for elements within a frame (iframe)?


I have successfully used Playwright in Python to get elements from a page. I have now run into the challenge of getting elements from a document embedded in an iframe. As an example, I used the w3schools page explaining the <option> element, which displays the result in an iframe. I am trying to retrieve a handle for an element inside that iframe (here, the <select> that holds the <option> elements).

The 'normal' way of getting an element on the page with page.querySelector() fails to return an elementHandle; this just prints <class 'NoneType'>:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        page = browser.newPage()
        page.goto('https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_option')
        element = page.querySelector('select')
        print(type(element))
        browser.close()

I tried explicitly getting a handle for the iframe first, but this yields the same result (<class 'NoneType'>):

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        page = browser.newPage()
        page.goto('https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_option')
        iframe = page.querySelector('iframe')
        element = iframe.querySelector('select')
        print(type(element))
        browser.close()

How can I get content from within the iframe?


Solution

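  • Turns out I was close: page.querySelector('iframe') returns a handle to the <iframe> element itself, not to the document embedded in it. To get the frame that holds that document, I needed to call the contentFrame() method on the handle.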

    Returns the content frame for element handles referencing iframe nodes, or null otherwise

    Then, querySelector() will return the respective elementHandle just fine:

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        for browser_type in [p.chromium, p.firefox, p.webkit]:
            browser = browser_type.launch(headless=False)
            page = browser.newPage()
            page.goto('https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_option')
            # Get the <iframe> element, then switch to the frame it contains
            iframe = page.querySelector('iframe').contentFrame()
            # Queries on the frame search the embedded document
            element = iframe.querySelector('select')
            print(type(element))
            print(element.innerHTML())
            browser.close()
    

    This successfully prints:

    <class 'playwright.sync_api.ElementHandle'>
    
      <option value="volvo">Volvo</option>
      <option value="saab">Saab</option>
      <option value="opel">Opel</option>
      <option value="audi">Audi</option>
    

    Note: if the page contains multiple iframes, you can select the one you want by an attribute when retrieving the handle. For example, to get the iframe in the above example by its id, use:

    iframe = page.querySelector('iframe[id="iframeResult"]').contentFrame()
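
For anyone on a newer release: the camelCase method names above come from early versions of Playwright for Python, and current versions expose the same calls in snake_case (new_page, query_selector, content_frame, inner_html). Below is a minimal sketch of the same steps with those names, plus the locator-based route via frame_locator(). The helper name query_in_iframe and the main() wrapper are my own and not part of Playwright, and the selector iframe#iframeResult assumes the id mentioned in the note above.

```python
from typing import Any, Optional


def query_in_iframe(page: Any, iframe_selector: str, inner_selector: str) -> Optional[Any]:
    """Return a handle for inner_selector inside the iframe matched by iframe_selector.

    Hypothetical helper: returns None if the iframe element, its content
    frame, or the inner element cannot be found.
    """
    # Handle to the <iframe> element itself, not to the embedded document
    iframe_element = page.query_selector(iframe_selector)
    if iframe_element is None:
        return None
    # The frame that holds the embedded document (None for non-iframe nodes)
    frame = iframe_element.content_frame()
    if frame is None:
        return None
    return frame.query_selector(inner_selector)


def main() -> None:
    # Imported here so the helper above can be exercised without a browser.
    from playwright.sync_api import sync_playwright

    url = "https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_option"
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)

        # Element-handle route: same steps as in the answer above
        select = query_in_iframe(page, "iframe#iframeResult", "select")
        if select is not None:
            print(select.inner_html())

        # Locator route: frame_locator() scopes later lookups to the iframe
        print(page.frame_locator("iframe#iframeResult").locator("select").inner_html())

        browser.close()
```

Calling main() runs both routes against the w3schools page. The helper only relies on query_selector() and content_frame(), so it returns None instead of raising when the iframe is not on the page.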