I have the following class in a FastAPI application:
```python
import asyncio
import logging
from multiprocessing import Lock, Process

from .production_status import Job as ProductionStatusJob


class JobScheduler:
    loop = None
    logger = logging.getLogger("job_scheduler")
    process_lock = Lock()

    JOBS = [ProductionStatusJob]

    @classmethod
    def start(cls) -> None:
        cls.logger.info("Starting Up (1/2)")
        Process(target=cls._loop).start()

    @classmethod
    def _loop(cls) -> None:
        cls.loop = asyncio.get_event_loop()
        cls.loop.create_task(cls._run())
        cls.logger.info("Startup Complete (2/2)")
        cls.loop.run_forever()
        cls.loop.close()

    @classmethod
    async def _run(cls) -> None:
        while True:
            ...

    @classmethod
    async def stop(cls) -> None:
        cls.logger.info("Shutting Down (1/2)")
        with cls.process_lock:
            cls.loop.stop()  # <= This Line
            cls.loop.close()
            cls.logger.info("Shutdown Complete (2/2)")
            cls.loop = None
```
On the `startup` and `shutdown` events of the FastAPI application, the `JobScheduler.start()` and `JobScheduler.stop()` methods will be called.

The `start` method works smoothly, but in `stop` I get an error:
File "/backend/app/main.py", line 146, in stop_job_scheduler
2023-08-16 11:46:27 await job_scheduler.stop()
2023-08-16 11:46:27 File "/backend/app/jobs/__init__.py", line 59, in stop
2023-08-16 11:46:27 cls.loop.stop()
2023-08-16 11:46:27 AttributeError: 'NoneType' object has no attribute 'stop'
But `cls.loop` is set during the `_loop` method (which is executed at the end of `start`), so why does `cls.loop` still have its initial `None` value when the `stop` method is called?

Are there any better approaches for cleaning up the background processes when the FastAPI application's `shutdown` event fires?
`multiprocessing` in Python is funny. It's more powerful than multithreading but also comes with some caveats. The first of those is that you're actually running a different Python interpreter entirely. That means that global variables and the like are going to get a new copy for each process you run.
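Here's a minimal sketch of the same failure, with the hypothetical `loop` global and `set_loop` function standing in for `JobScheduler.loop` and `JobScheduler._loop`:

```python
import multiprocessing

loop = None  # stands in for JobScheduler.loop

def set_loop():
    global loop
    loop = "event loop"         # mutates the child process's copy only
    print("child sees:", loop)  # -> event loop

if __name__ == "__main__":
    p = multiprocessing.Process(target=set_loop)
    p.start()
    p.join()
    print("parent sees:", loop)  # -> None; the parent's copy never changed
```

That's exactly why `cls.loop` is still `None` in `stop`: `_loop` assigns it inside the child process, while `stop` runs in the parent.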
Depending on your operating system and choice of start method, your processes may be forked or spawned. A spawned process starts anew, as though a new Python program had just been spun up. A forked process gets all of the current values of variables from the source process, but it still copies those variables. Future changes in either process will not affect the other without explicit synchronization using one of the `multiprocessing` helpers.
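A short sketch of the difference, assuming a POSIX system (the `"fork"` start method isn't available on Windows):

```python
import multiprocessing

value = "module default"

def report():
    print("child sees:", value)

if __name__ == "__main__":
    value = "changed in parent"  # happens after import, before the child starts

    # With "fork" (POSIX-only) the child copies the parent's current memory,
    # so it prints "changed in parent". With "spawn" (the default on Windows
    # and macOS) the module is re-imported in a fresh interpreter, and since
    # this __main__ block doesn't re-run in the child, it prints "module default".
    ctx = multiprocessing.get_context("fork")
    p = ctx.Process(target=report)
    p.start()
    p.join()
```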
You can use a `Manager` to synchronize data between processes explicitly. This acts sort of like a local server that both processes connect to. For more explicitly pub-sub data, you can also use a `Queue` to pass information from one process to another.
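A sketch of both, with hypothetical names throughout: writes to the `Manager` dict proxy are visible to both processes, and the `Queue` delivers a one-off shutdown message, which is one way to approach the cleanup question above:

```python
import multiprocessing

def worker(shared, queue):
    # Writes go through the Manager proxy, so the parent sees them too.
    shared["loop_state"] = "running"
    # Block until the parent sends a shutdown message over the queue.
    msg = queue.get()
    shared["loop_state"] = f"stopped ({msg})"

if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        shared = manager.dict()
        queue = multiprocessing.Queue()
        p = multiprocessing.Process(target=worker, args=(shared, queue))
        p.start()
        queue.put("shutdown requested")  # visible in the child, unlike a plain global
        p.join()
        print(shared["loop_state"])      # -> stopped (shutdown requested)
```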