From the documentation I have this example, which I've tested and it works:
from requests_html import AsyncHTMLSession
asession = AsyncHTMLSession()
async def get_pythonorg():
    r = await asession.get('https://python.org/')
async def get_reddit():
    r = await asession.get('https://reddit.com/')
async def get_google():
    r = await asession.get('https://google.com/')
result = asession.run(get_pythonorg, get_reddit, get_google)
But what if my URLs are variable? I'd like to do this:
from requests_html import AsyncHTMLSession
urls = ('https://python.org/', 'https://reddit.com/', 'https://google.com/')
asession = AsyncHTMLSession()
async def get_url(url):
    r = await asession.get(url)
tasks = []
for url in urls:
    tasks.append(get_url(url=url))
result = asession.run(*tasks)
But I get:
Traceback (most recent call last):
  File "./test.py", line 17, in <module>
    result = asession.run(*tasks)
  File "/home/deanresin/.local/lib/python3.7/site-packages/requests_html.py", line 772, in run
    asyncio.ensure_future(coro()) for coro in coros
  File "/home/deanresin/.local/lib/python3.7/site-packages/requests_html.py", line 772, in <listcomp>
    asyncio.ensure_future(coro()) for coro in coros
TypeError: 'coroutine' object is not callable
sys:1: RuntimeWarning: coroutine 'get_url' was never awaited
This happens because you are passing coroutine objects, not coroutine functions.
You can do:
from requests_html import AsyncHTMLSession
urls = ('https://python.org/', 'https://reddit.com/', 'https://google.com/')
asession = AsyncHTMLSession()
async def get_url(url):
    r = await asession.get(url)
    # if you want async javascript rendered page:
    await r.html.arender()
    return r
all_responses = asession.run(*[lambda url=url: get_url(url) for url in urls])
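As an alternative to the lambda, functools.partial can bind the URL as well. The sketch below is only an illustration under stated assumptions: get_url is a stand-in that does no network access, and run is a simplified, hypothetical substitute for AsyncHTMLSession.run() that, like the real one, calls each item it receives.

```python
import asyncio
from functools import partial


async def get_url(url):
    # Stand-in for `await asession.get(url)` so the sketch runs offline.
    await asyncio.sleep(0)
    return f"response for {url}"


def run(*coros):
    # Hypothetical, simplified substitute for AsyncHTMLSession.run():
    # it calls each item, so every item must be a zero-argument callable
    # that returns a coroutine object.
    async def gather_all():
        return await asyncio.gather(*(coro() for coro in coros))

    return asyncio.run(gather_all())


urls = ('https://python.org/', 'https://reddit.com/', 'https://google.com/')

# partial(get_url, url) binds url right away and can be called with no arguments.
all_responses = run(*[partial(get_url, url) for url in urls])
```

With the real session, the same list of partial objects should be usable as asession.run(*[partial(get_url, url) for url in urls]), since run() only needs each item to be callable.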
The error comes from result = asession.run(*tasks), so let's look at the source code of AsyncHTMLSession.run():
def run(self, *coros):
    """ Pass in all the coroutines you want to run, it will wrap each one
        in a task, run it and wait for the result. Return a list with all
        results, this is returned in the same order coros are passed in. """
    tasks = [
        asyncio.ensure_future(coro()) for coro in coros
    ]
    done, _ = self.loop.run_until_complete(asyncio.wait(tasks))
    return [t.result() for t in done]
So the following list comprehension expects each item to be a callable coroutine function, not a coroutine object:
tasks = [
    asyncio.ensure_future(coro()) for coro in coros
]
But your error says TypeError: 'coroutine' object is not callable, and it is raised by the coro() call in that comprehension.
So you are passing a list of coroutine objects, not coroutine functions.
Indeed when you are doing this:
tasks = []
for url in urls:
    tasks.append(get_url(url=url))
You are making a list of coroutine objects by calling your coroutine function.
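To see the distinction in isolation, here is a minimal sketch that uses only the standard library. The get_url below is a stand-in that makes no request; it is only there to show what calling a coroutine function produces and why calling the result a second time fails.

```python
import asyncio
import inspect


async def get_url(url):
    # Stand-in for the real request; only the call semantics matter here.
    await asyncio.sleep(0)
    return url


# get_url itself is a coroutine function.
is_function = inspect.iscoroutinefunction(get_url)

# Calling it does not run the body; it creates a coroutine object.
coro = get_url('https://python.org/')
is_object = inspect.iscoroutine(coro)

# run() calls every item it receives. Calling a coroutine object
# fails with the same TypeError as in the traceback.
try:
    coro()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Close the never-awaited coroutine to avoid the RuntimeWarning.
coro.close()
```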
So, to make a list of coroutine functions, you can use a lambda like this:
[lambda url=url: get_url(url) for url in urls]
Note the url=url default argument: it binds the current value of url when the lambda is defined, rather than looking url up later when the lambda is called.
More information about this here.
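The effect of url=url can be checked without any async code. This is a small sketch; the lambdas simply return the URL they see instead of calling get_url, and the names late_bound and early_bound are only illustrative.

```python
urls = ('https://python.org/', 'https://reddit.com/', 'https://google.com/')

# Without a default argument, each lambda looks up `url` when it is called,
# after the comprehension has finished, so all of them see the last value.
late_bound = [lambda: url for url in urls]
late_results = [f() for f in late_bound]

# With url=url, each lambda stores the value that was current when it was defined.
early_bound = [lambda url=url: url for url in urls]
early_results = [f() for f in early_bound]
```

Here late_results contains 'https://google.com/' three times, while early_results matches urls one for one, which is why the default argument is needed in the list passed to asession.run().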