
Receiving response from Python as an array - PHP


I have PHP code that calls a Python script. The script takes a URL, opens it, reads the JSON page, and sends that JSON back to the PHP code. The issue is that I receive the JSON as an array rather than in the correct JSON format. Any help?

Python code:

import json
import sys
import urllib.parse

from playwright.sync_api import sync_playwright

# The URL arrives percent-encoded as the first command-line argument.
link = urllib.parse.unquote(sys.argv[1])

with sync_playwright() as p:
    browser = p.chromium.launch()
    context = browser.new_context(user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36')
    page = context.new_page()
    # Load the saved cookies and close the file once they are read.
    with open('./cookies.json') as cookie_file:
        cookies = json.load(cookie_file)
    context.add_cookies(cookies)
    try:
        page.goto(link)
        page.wait_for_timeout(10000)
        # page.content() returns the page serialized as HTML.
        print(page.content())
    except Exception:
        print("Error in playwright script.")
    finally:
        # Clean up on both the success and the error path.
        page.close()
        context.close()
        browser.close()

Solution

  • The content function relies on document.documentElement.outerHTML, so it returns the page serialized as HTML and you might get a formatted value rather than the raw JSON. If the request returns JSON, you could grab the text from the response that goto returns:

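    response = page.goto("https://raw.githubusercontent.com/corysimmons/colors.json/master/colors.json")
    # response.text() is the raw response body, without any HTML around it.
    jsonContent = response.text()
    jsonResult = json.loads(jsonContent)
    # Serialize with json.dumps so the printed output is valid JSON.
    print(json.dumps(jsonResult))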

    If the page does some processing after the response arrives, you could ask for the inner_text of the :root element instead:

    jsonContent = page.inner_text(':root')
    jsonResult = json.loads(jsonContent)
    # Serialize with json.dumps so the printed output is valid JSON.
    print(json.dumps(jsonResult))
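
One thing to watch when handing the result back to PHP: printing a parsed Python object directly writes Python's own representation (single-quoted strings), which is not valid JSON. Below is a minimal sketch of the difference, assuming the PHP side passes the script's output to something like json_decode. The emit_json helper and the sample payload are hypothetical and only there for illustration.

```python
import json


def emit_json(raw_text: str) -> str:
    """Parse raw_text as JSON and return it re-serialized as compact JSON."""
    data = json.loads(raw_text)
    return json.dumps(data, separators=(",", ":"))


# Hypothetical payload standing in for the text read from the page.
sample = '{"name": "red", "rgb": [255, 0, 0]}'
parsed = json.loads(sample)

# What print(parsed) would write: Python's repr, with single quotes.
python_repr = str(parsed)

# What a JSON consumer expects: double-quoted, valid JSON text.
json_text = emit_json(sample)

print(python_repr)  # {'name': 'red', 'rgb': [255, 0, 0]}
print(json_text)    # {"name":"red","rgb":[255,0,0]}
```

If the PHP code only needs the JSON unchanged, printing the raw text is enough. Parsing first is mainly useful as a check that the page really returned JSON.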