I have successfully used Playwright in Python to get elements from a page. I have now run into the challenge of getting elements from a document embedded within an iframe. As an example, I used the w3schools page explaining the <option> element, which displays the result in an iframe. I am trying to retrieve a handle for this <option> element from the iframe.
The 'normal' way of getting an element on the page, page.querySelector(), fails to return an elementHandle; this just prints <class 'NoneType'>:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        page = browser.newPage()
        page.goto('https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_option')
        # Searches only the top-level document, not the document inside the iframe
        element = page.querySelector('select')
        print(type(element))
        browser.close()
I tried explicitly getting a handle for the iframe first, but this yields the same result (<class 'NoneType'>):
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        page = browser.newPage()
        page.goto('https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_option')
        # This is a handle for the <iframe> node itself, not for the document it embeds
        iframe = page.querySelector('iframe')
        element = iframe.querySelector('select')
        print(type(element))
        browser.close()
How can I get content from within the iframe?
Turns out I was close, but to get the iframe correctly, I needed to call the contentFrame() method on the iframe's element handle. Its description reads: "Returns the content frame for element handles referencing iframe nodes, or null otherwise". Then querySelector() returns the respective elementHandle just fine:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium, p.firefox, p.webkit]:
        browser = browser_type.launch(headless=False)
        page = browser.newPage()
        page.goto('https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_option')
        # contentFrame() turns the <iframe> element handle into the frame it embeds
        iframe = page.querySelector('iframe').contentFrame()
        element = iframe.querySelector('select')
        print(type(element))
        print(element.innerHTML())
        browser.close()
This successfully prints:
<class 'playwright.sync_api.ElementHandle'>
<option value="volvo">Volvo</option>
<option value="saab">Saab</option>
<option value="opel">Opel</option>
<option value="audi">Audi</option>
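Newer releases of Playwright for Python expose these methods under snake_case names (new_page, query_selector, content_frame, inner_html), so the snippet above may need renaming depending on the installed version. Here is a minimal sketch of the same lookup with explicit None checks, assuming such a release. The helper name query_in_iframe is made up for illustration, and main() is only defined here, not called.

```python
def query_in_iframe(page, iframe_selector, inner_selector):
    """Return a handle for inner_selector inside the iframe, or None."""
    iframe_handle = page.query_selector(iframe_selector)
    if iframe_handle is None:
        return None  # no element on the page matches iframe_selector
    frame = iframe_handle.content_frame()
    if frame is None:
        return None  # the matched element is not an iframe
    return frame.query_selector(inner_selector)


def main():
    # Imported lazily so query_in_iframe can be reused without Playwright installed.
    from playwright.sync_api import sync_playwright

    url = 'https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_option'
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        element = query_in_iframe(page, 'iframe', 'select')
        if element is not None:
            print(element.inner_html())
        browser.close()
```

Calling main() should launch Chromium and print the options as above. The None checks avoid the AttributeError that a chained call would raise when the iframe is missing.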
Note: if there are multiple iframes, you can use an attribute selector when retrieving the handle. For example, to get the iframe in the example above by its id, use:
iframe = page.querySelector('iframe[id="iframeResult"]').contentFrame()
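As an alternative sketch, recent Playwright versions also offer page.frame_locator(), which addresses elements inside an iframe without fetching the frame handle first. This assumes a snake_case release that ships the locator API. The helper name select_html_via_locator is hypothetical, and the default selectors simply reuse the id and tag from the example above.

```python
def select_html_via_locator(page, iframe_selector='#iframeResult', inner_selector='select'):
    """Return the inner HTML of inner_selector inside the given iframe."""
    # frame_locator() scopes the following locator() call to the iframe's document.
    locator = page.frame_locator(iframe_selector).locator(inner_selector)
    return locator.inner_html()
```

With a page that has already navigated to the w3schools URL, select_html_via_locator(page) would be expected to return the same option markup as element.innerHTML() did above.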