Prefect Local Agent Troubleshooting


I am running a flow on a local agent in Docker on EC2 (not ECS). Prefect Cloud is configured to provide a UI for monitoring. The flow executes every 5 minutes, and for about an hour or so it does just fine. However, the flow runs eventually fall behind before failing to execute entirely, and I get the 'cannot find a heartbeat' error.

Is there a way to run the local agent continuously? Why does it suddenly stop?

I apologise for the simplicity of the question, but I am new to Prefect.

Cheers


Solution

  • When the local or Docker agent itself runs inside a container (rather than as a local process), your flow runs are not deployed as individual containers; they run within the agent container. You effectively have a single agent container spinning up new containers inside itself (Docker in Docker), which can have many unintended consequences, such as issues with scaling and resource utilization.

    To solve this, I would recommend running the local agent as a local process monitored by supervisord. This documentation page provides more information.

    If you want more environment isolation for this agent process, you can run it within a virtual environment.

    To learn more about flow heartbeats, check out this page, and see this one for running a local or Docker agent in a container.
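
As a rough illustration of the supervisord suggestion above, here is a minimal sketch that uses Python's standard `configparser` to generate a `[program:...]` section that keeps the agent process running and restarts it if it exits. It assumes the Prefect 1.x CLI, where `prefect agent local start` launches the local agent. The program name and log path are hypothetical placeholders, and the generated section would still need to be included from a full supervisord configuration on your host.

```python
import configparser
import io

# Hypothetical names and paths, chosen for illustration only.
PROGRAM_NAME = "prefect-agent"
LOG_FILE = "/tmp/prefect-agent.log"

# Assumption: Prefect 1.x CLI, where this command starts a local agent.
AGENT_COMMAND = "prefect agent local start"


def build_supervisord_conf() -> str:
    """Return the text of a supervisord [program:...] section for the agent."""
    conf = configparser.ConfigParser()
    conf[f"program:{PROGRAM_NAME}"] = {
        "command": AGENT_COMMAND,
        "autostart": "true",        # start the agent when supervisord starts
        "autorestart": "true",      # restart the agent if the process exits
        "redirect_stderr": "true",  # send stderr to the stdout log file
        "stdout_logfile": LOG_FILE,
    }
    buffer = io.StringIO()
    conf.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_supervisord_conf())
```

Running the script prints the section, which you could save to a file and reference from your supervisord configuration. If your Prefect version provides `prefect agent local install`, that command may generate a comparable configuration for you, so check its output before writing one by hand.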