Tags: python, python-3.x, webautomation, playwright, playwright-python

Print page source in Python Playwright


I have a PHP script that calls the following Python code with a URL parameter:

import json
import sys
import urllib.parse
link = urllib.parse.unquote(sys.argv[1])
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.launch()
    context = browser.new_context(user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36')
    page = context.new_page()
    cookie_file = open('./cookies.json')
    cookies = json.load(cookie_file)
    print(cookies)
    context.add_cookies(cookies)
    page.goto(link)
    try:
        page.wait_for_timeout(10000)
        print(page.innerHTML("*"))
        page.close()
        context.close()
        browser.close()      
    except Exception as e:
        print("Error in playwright script.")
        page.close()
        context.close()
        browser.close()     

However, when I try to print the page source after visiting the page, I get:

Error in playwright script.

because the code I tried doesn't work:

print(page.innerHTML("*"))

Any help?


Solution

  • To get the full HTML content of the page you can use page.content().
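
For context, the most likely cause of the error is that innerHTML is the JavaScript spelling; the Python binding uses snake_case (page.inner_html(selector)), so page.innerHTML raises an AttributeError, which the bare except then hides behind a generic message. Below is a minimal sketch of the question's script rewritten around page.content(). The function names load_cookies, fetch_page_source and main are hypothetical helpers introduced here for illustration, and the cookie file is assumed to hold a JSON list in the format context.add_cookies() accepts.

```python
import json
import sys
import urllib.parse

# Same user agent string as in the question.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
)


def load_cookies(path):
    """Read cookies from a JSON file (assumed to be a list of cookie dicts)."""
    with open(path) as cookie_file:
        return json.load(cookie_file)


def fetch_page_source(link, cookies_path="./cookies.json", wait_ms=10000):
    """Open `link` in headless Chromium and return the page's full HTML."""
    # Imported inside the function so load_cookies() stays usable
    # even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            context = browser.new_context(user_agent=USER_AGENT)
            context.add_cookies(load_cookies(cookies_path))
            page = context.new_page()
            page.goto(link)
            page.wait_for_timeout(wait_ms)
            # content() returns the full HTML, including the doctype.
            return page.content()
        finally:
            # Closing the browser also closes its contexts and pages.
            browser.close()


def main(argv):
    """Entry point: argv[1] is the URL-encoded link passed in by PHP."""
    link = urllib.parse.unquote(argv[1])
    try:
        print(fetch_page_source(link))
    except Exception as e:
        # Print the real exception so failures are not silently masked.
        print(f"Error in playwright script: {e!r}")
```

To use it as the script PHP invokes, call main(sys.argv) at the bottom of the file. Printing the exception itself, rather than a fixed message, would have surfaced the AttributeError directly in the original script.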