Tags: python, rest, zeromq, hypercorn

zeromq failing after running for a bit


I currently run a process as well as an API. These two need to communicate; more specifically, the API needs to send requests to the process. For this I am using ZeroMQ's Python library (pyzmq). My code is as follows:

server.py:

from json import dumps, loads

from zmq import POLLIN, ROUTER
from zmq.asyncio import Context, Poller
from zmq.auth.asyncio import AsyncioAuthenticator

    async def start(self):
        """Starts the zmq server asynchronously and handles incoming requests"""
        context = Context()

        auth = AsyncioAuthenticator(context)
        auth.start()
        auth.configure_plain(domain="*", passwords={"user": IPC_TOKEN})
        auth.allow("127.0.0.1")

        socket = context.socket(ROUTER)
        socket.plain_server = True
        socket.bind("tcp://*:5555")

        poller = Poller()
        poller.register(socket, POLLIN)

        while True:
            socks = dict(await poller.poll())

            if socket in socks and socks[socket] == POLLIN:
                message = await socket.recv_multipart()
                identity, request = message
                decoded = loads(request.decode())
                res = await getattr(self, decoded["route"])(decoded["data"])
                if res:
                    await socket.send_multipart([identity, dumps(res).encode()])
                else:
                    await socket.send_multipart([identity, b'{"status":"ok"}'])

client.py:

import json
import uuid

from zmq import DEALER, POLLIN
from zmq.asyncio import Context, Poller

async def make_request(route: str, data: dict) -> dict:
    context = Context.instance()
    socket = context.socket(DEALER)
    socket.identity = uuid.uuid4().hex.encode('utf-8')
    socket.plain_username = b"user"
    socket.plain_password = IPC_TOKEN.encode("UTF-8")
    socket.connect("tcp://localhost:5555")

    request = json.dumps({"route": route, "data": data}).encode('utf-8')
    # zmq.asyncio sockets return a Future from send(), so await it
    await socket.send(request)

    poller = Poller()
    poller.register(socket, POLLIN)

    while True:
        events = dict(await poller.poll())
        if socket in events and events[socket] == POLLIN:
            multipart = json.loads((await socket.recv_multipart())[0].decode())
            socket.close()
            context.term()
            return multipart

This works perfectly fine for the first few requests, but after a certain number of requests the code fails silently (without any error) and I get no IPC responses. When I curl the API, which uses this code as a client to my second process, the request times out. (screenshot: timeout)
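A silent hang like this is easier to diagnose if the wait for a reply has a deadline. The following is only a minimal sketch of that idea using the standard library's asyncio.wait_for. The names make_request_stub and call_with_deadline are hypothetical and not part of the code above; the stub simply waits forever to imitate a request that never gets a reply.

```python
import asyncio


async def make_request_stub(route: str, data: dict) -> dict:
    """Hypothetical stand-in for make_request that never receives a reply."""
    await asyncio.Event().wait()  # never set, so this waits forever
    return {}


async def call_with_deadline(coro, seconds: float) -> dict:
    """Await coro, but give up after `seconds` instead of hanging."""
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        # Report the missing reply explicitly rather than failing silently
        return {"status": "error", "reason": f"no IPC reply within {seconds}s"}


result = asyncio.run(call_with_deadline(make_request_stub("ping", {}), 0.1))
print(result)
```

Note that asyncio.wait_for cancels the wrapped coroutine when the deadline passes, so any cleanup in the real request function (such as closing the socket) would need to sit in a finally block to run in that case.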

I assume this is because the processes are not properly closed or are getting clogged up, but I am not sure how to fix this. Here is the output of an lsof command I ran on my server after the code had been running for a while and had stopped working: (screenshot: lsof output)

How can I prevent these connections from timing out after a while?


Solution

  • As it turns out, this is not a ZeroMQ issue. I sent a request to the ZeroMQ TCP endpoint while the API was failing, and it still returned data just fine. The problem is instead with Hypercorn, and I have yet to figure it out. It is hard to spot where an issue is coming from when you have a lot of moving parts! I hope this at least helps someone else with a similar issue.