python python-requests jinja2 aiohttp

Python: How to get HTML text that has Jinja templates using requests or aiohttp?


I am using Python with requests or aiohttp to fetch a page, and BeautifulSoup4 to parse it. The server's HTML page uses a Jinja template, so when I fetch the page with requests or aiohttp, I get something like this:

<a href="/{{username}}" class='pr'>

But if you open this page in a browser, the code looks like this:

<a href="/gavrilka" class='pr'>

requests code:

import requests

url = 'MY URL'
headers = {}  # MY HEADERS: a dict mapping header names to values
# A GET request needs no body, so the empty payload is dropped
response = requests.get(url, headers=headers)
print(response.text)  # already decoded text, no manual encode needed

aiohttp code:

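import asyncio
import aiohttp

url = 'MY URL'
headers = {}  # MY HEADERS: a dict mapping header names to values

async def main():
    # The context manager closes the session, so no explicit close() is needed
    async with aiohttp.ClientSession() as session:
        async with session.get(url, headers=headers) as resp:
            data = await resp.text()
            print(data)

asyncio.run(main())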

What should I do to get the page text as the browser shows it?


Solution

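  • I used Selenium with PhantomJS, and now it works. Selenium loads the page in a browser engine that runs the page's JavaScript, so page_source reflects the HTML after the scripts have run rather than the raw template.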

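    from selenium import webdriver
    from bs4 import BeautifulSoup

    url = "https://yourlink"

    # webdriver.PhantomJS is only available in Selenium 3.x and earlier
    driver = webdriver.PhantomJS()
    driver.set_window_size(1024, 768)  # optional
    driver.get(url)  # loads the page and runs its JavaScript
    page_source = driver.page_source  # HTML after rendering
    driver.quit()  # shut down the PhantomJS process
    soup = BeautifulSoup(page_source, 'lxml')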
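
PhantomJS is no longer maintained and Selenium 4 dropped its driver, so the same idea can be sketched with headless Chrome instead. This is a minimal sketch under assumptions, not the code from the answer above: it assumes Selenium 4 and a local Chrome install, and find_placeholders and render_with_headless_chrome are hypothetical helper names. It also assumes the page fills in every {{...}} placeholder once its scripts have run; if some remain (for example inside a script tag), the wait will time out.

```python
import re

# Matches simple template placeholders such as {{username}} or {{ user.name }}.
PLACEHOLDER = re.compile(r"\{\{\s*[\w.]+\s*\}\}")


def find_placeholders(html):
    """Return every {{...}} placeholder still present in the HTML text."""
    return PLACEHOLDER.findall(html)


def render_with_headless_chrome(url, timeout=10):
    """Load url in headless Chrome and return the HTML once no placeholders remain.

    Assumes Selenium 4 and a local Chrome install. Raises
    selenium.common.exceptions.TimeoutException if placeholders are
    still present after `timeout` seconds.
    """
    # Imported here so find_placeholders works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")  # headless flag in recent Chrome versions
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Poll the rendered source until the scripts have replaced the placeholders.
        WebDriverWait(driver, timeout).until(
            lambda d: not find_placeholders(d.page_source)
        )
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process
```

Running find_placeholders(response.text) on the requests output is a quick way to confirm that the response is still an unrendered template. The string returned by render_with_headless_chrome("https://yourlink") can be passed to BeautifulSoup(html, 'lxml') exactly as in the PhantomJS version.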