For web scraping, I use a Semaphore object to limit the number of concurrent async requests. I also have another async process that simply prints the total number of objects scraped from the website. Does the semaphore count all async processes, or only the ones running under its context manager?
async def __get_functions(self):
    # some code here ...
    semaphore = asyncio.Semaphore(5)
    async with aiohttp.ClientSession() as session:
        tasks = [self.add_data(session, li, semaphore) for li in li_items]
        await asyncio.gather(*tasks)

async def add_data(self, s, li, semaphore):
    # some code here ...
    async with semaphore:
        info = await self.__get_function_data(s, href)
        self.data.append({"Name": name, **info})
        self.count += 1
Entry point of the program:
# As you can see, other async functions run here as well
async def __start_application(self):
    await asyncio.gather(self.__print_total_count(), self.__get_functions())

# I call this method on an instance of the class
def start(self):
    asyncio.run(self.__start_application())
First I would like to correct your terminology. In an asyncio program, the execution unit is a Task, not a Process. The two words have very different meanings in Python and it is important to use the correct one.
When you use a semaphore in an async with statement, the block (which you say is "under" the semaphore) contains neither a Task nor a Process but just some code. The semaphore counts calls to its two methods, acquire and release: acquire decrements the count and release increments it. The async with statement ensures that those calls are always paired: when you enter the statement, it calls acquire; when you exit the block, it calls release. The point of a Semaphore is that the count is never less than zero: if you call acquire when the count is already zero, it suspends the task at that point. The task will not go forward until another task calls release. That's pretty much the whole story for a Semaphore. That's all it does.
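To make the pairing concrete, here is a minimal sketch of what the async with form amounts to, written out with explicit acquire and release calls. The function names and the log list are hypothetical and exist only for illustration; they are not part of your code.

```python
import asyncio


async def with_context_manager(sem, log):
    # Entering the block awaits sem.acquire(); leaving it calls sem.release().
    async with sem:
        log.append("inside")
        await asyncio.sleep(0)


async def with_explicit_calls(sem, log):
    # Roughly the same thing spelled out by hand (illustrative only).
    await sem.acquire()
    try:
        log.append("inside")
        await asyncio.sleep(0)
    finally:
        # Always give the slot back, even if the body raises.
        sem.release()


async def demo():
    sem = asyncio.Semaphore(1)
    log = []
    await asyncio.gather(
        with_context_manager(sem, log),
        with_explicit_calls(sem, log),
    )
    # Every acquire was matched by a release, so the semaphore is free again.
    return log, sem.locked()


log, still_locked = asyncio.run(demo())
```

Both coroutines enter the guarded section one at a time, and once they finish the semaphore can be acquired again immediately.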
So in your case, if you have 6 tasks that all try to acquire the semaphore, five tasks will go through and begin to execute the code inside the with block. The sixth task will stop until one of the other five tasks leaves the with block.
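The scenario above can be tried out with a small sketch. It assumes six hypothetical worker coroutines that share one Semaphore(5), plus a reporter coroutine that never touches the semaphore, standing in for your print-total-count task. The names worker, reporter and state are made up for this example.

```python
import asyncio


async def worker(sem, state):
    # Only code inside this block is limited by the semaphore.
    async with sem:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stands in for the HTTP request
        state["active"] -= 1


async def reporter(state):
    # Never calls acquire, so the semaphore does not count or delay it.
    for _ in range(3):
        state["reports"] += 1
        await asyncio.sleep(0.005)


async def main():
    sem = asyncio.Semaphore(5)
    state = {"active": 0, "peak": 0, "reports": 0}
    await asyncio.gather(
        reporter(state),
        *(worker(sem, state) for _ in range(6)),
    )
    return state


state = asyncio.run(main())
print(state["peak"], state["reports"])
```

Five workers enter the block at once and the sixth waits for a free slot, while the reporter runs alongside them without using a slot. In other words, the semaphore only affects tasks that actually try to acquire it.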
https://docs.python.org/3/library/asyncio-sync.html?highlight=asyncio%20semaphore#asyncio.Semaphore