I have PHP code that calls a Python script. The script takes a URL, opens it, reads the JSON page, and sends the JSON back to the PHP code. The problem is that I get the JSON back as an array and not in the correct JSON format. Any help?
Python code:
import json
import sys
import bs4
import urllib.parse

from playwright.sync_api import sync_playwright

link = urllib.parse.unquote(sys.argv[1])

with sync_playwright() as p:
    browser = p.chromium.launch()
    context = browser.new_context(user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36')
    page = context.new_page()
    cookie_file = open('./cookies.json')
    cookies = json.load(cookie_file)
    context.add_cookies(cookies)
    try:
        page.goto(link)
        page.wait_for_timeout(10000)
        print(page.content())
        page.close()
        context.close()
        browser.close()
    except Exception as e:
        print("Error in playwright script.")
        page.close()
        context.close()
        browser.close()
The content function relies on document.documentElement.outerHTML, so you might get a formatted value (the page's rendered markup) rather than the raw response body.
If the request returns JSON, you could grab the text from the response that goto returns:
response = page.goto("https://raw.githubusercontent.com/corysimmons/colors.json/master/colors.json")
jsonContent = response.text()
jsonResult = json.loads(jsonContent)
print(jsonResult)
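One thing to watch when the output goes back to PHP: printing a parsed Python object writes its Python representation (single quotes, True/None), which is not valid JSON. A minimal sketch of re-serializing with json.dumps before printing is below. The helper name to_json_line and the sample string are made up for illustration; in the real script the input would be the text taken from the response.

```python
import json


def to_json_line(raw_text: str) -> str:
    """Parse raw_text to check it is valid JSON, then serialize it again.

    Raises json.JSONDecodeError if raw_text is not JSON.
    """
    parsed = json.loads(raw_text)
    return json.dumps(parsed)


# Hypothetical sample standing in for response.text()
raw = '{"aliceblue": [240, 248, 255, 1], "ok": true}'

# Printing the parsed object gives Python syntax, not JSON:
print(json.loads(raw))    # {'aliceblue': [240, 248, 255, 1], 'ok': True}

# Printing the json.dumps result gives text a JSON parser can read:
print(to_json_line(raw))  # {"aliceblue": [240, 248, 255, 1], "ok": true}
```

On the PHP side, the second form should be what json_decode expects to receive.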
If there is some processing in the middle, you could ask for the inner_text of the :root element:
jsonContent = page.inner_text(':root')
jsonResult = json.loads(jsonContent)
print(jsonResult)
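If it is not known in advance which of the two approaches will yield clean JSON, they can be tried in order. The sketch below is one possible way to do that; parse_first_valid is a hypothetical helper, and the two sample strings are stand-ins for what response.text() and page.inner_text(':root') might return.

```python
import json
from typing import Any, Optional


def parse_first_valid(*candidates: Optional[str]) -> Any:
    """Return the parsed value of the first candidate that is valid JSON."""
    for text in candidates:
        if not text:
            continue  # skip None and empty strings
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue  # not JSON, try the next candidate
    raise ValueError("no candidate contained valid JSON")


# Hypothetical stand-ins for the two sources of text
markup_text = "<html><body><pre>{\"a\": 1}</pre></body></html>"
inner_text = '{"a": 1}'

result = parse_first_valid(markup_text, inner_text)
print(json.dumps(result))  # {"a": 1}
```

In the Playwright script this could be called as parse_first_valid(response.text(), page.inner_text(':root')), assuming response is the value returned by goto.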