Tags: java, multithreading, sockets, connection-pooling, zeromq

ZeroMQ multithreading: create sockets on-demand or use sockets object pool?


I'm building a POC that uses ZeroMQ's N-to-N pub/sub model. On our app server, when a thread servicing an HTTP request pulls data from the database, it updates a local memcache instance with that data. To synchronize the other memcache instances in the app server cluster, the request thread then sends a message containing the data through a ZMQ publisher. So the question is: what strategy is most effective at minimizing socket create/destroy overhead when the application has many threads that depend on sockets for sending messages? Do we share a pool of sockets, create and destroy sockets per thread, or something else?

Strategy 1 - Thread-managed Publisher Socket
In this approach, each thread (T1, T2, and T3) manages the lifecycle of its own socket object (publisher): it creates the socket, makes the connection, sends a message, and finally closes the socket. Because no socket is ever shared between threads, this is certainly the safest approach, but we are concerned about the overhead of repeatedly creating, connecting, and destroying sockets. If that overhead hurts performance, we'd like to avoid it.


Strategy 2 - Publisher Sockets Object Pool
In this approach, the parent process (app server) initializes a pool of ZMQ publishers on startup. When a thread needs a publisher, it gets one from the object pool, sends its message, then returns the publisher to the pool. From the thread's point of view, the cost of creating, connecting, and destroying sockets is eliminated. However, access to the pool must be synchronized so that no two threads use the same publisher object at the same time, and this is where deadlocks and concurrency issues may arise.
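
As a rough illustration of Strategy 2, the sketch below uses a `java.util.concurrent.BlockingQueue` as the pool. `Publisher`, `PublisherPool`, and `PublisherPoolDemo` are hypothetical names, and `Publisher` is only a stand-in for a connected PUB socket, not a ZeroMQ type. Because `take()` blocks until a publisher is free and each publisher is held by exactly one thread at a time, the pool itself needs no nested locks. Whether a real 0MQ socket tolerates being passed between threads this way is a separate question that this sketch does not answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a connected ZMQ PUB socket; not a real ZeroMQ type. */
interface Publisher {
    void send(String message);
}

/** Fixed-size pool: a borrower blocks until a publisher is free, so no two threads share one. */
final class PublisherPool {
    private final BlockingQueue<Publisher> idle;

    PublisherPool(List<Publisher> publishers) {
        // Capacity equals the number of publishers, so returning one can never fail.
        idle = new ArrayBlockingQueue<>(publishers.size(), false, publishers);
    }

    /** Borrows a publisher, sends one message, and always returns the publisher. */
    void publish(String message) throws InterruptedException {
        Publisher p = idle.take();   // blocks while every publisher is in use
        try {
            p.send(message);
        } finally {
            idle.offer(p);           // non-blocking return to the pool
        }
    }

    int idleCount() {
        return idle.size();
    }
}

final class PublisherPoolDemo {
    /** Runs one task per simulated request thread; returns the number of messages sent. */
    static int run(int poolSize, int threads) throws InterruptedException {
        AtomicInteger sent = new AtomicInteger();
        List<Publisher> publishers = new ArrayList<>();
        for (int i = 0; i < poolSize; i++) {
            publishers.add(message -> sent.incrementAndGet()); // counts instead of sending
        }
        PublisherPool pool = new PublisherPool(publishers);

        ExecutorService workers = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers.submit(() -> {
                pool.publish("cache-update-" + id);
                return null; // Callable, so publish() may throw InterruptedException
            });
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
        return sent.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(8, 150) + " messages sent");
    }
}
```

With a pool of 8 and 150 worker threads, every message should still go out; a thread that finds the pool empty simply waits until a publisher is returned. The pool size then becomes the knob that trades socket count against time spent waiting.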

We have not profiled either approach because we wanted to do a litmus test on SO first. With respect to volume, our application is not publish-"heavy", but there could be 100-150 threads (per app server) needing to publish a message at the same time.

[Figure: ZMQ Publisher Object Pool]

So, to reiterate: What strategy is the most effective with respect to minimizing overhead while emphasizing performance when the application has many threads that depend on publishers for sending messages?


Solution

  • You can't really ask a question about performance without providing real figures for your estimated throughput. Are we talking about 10 requests per second, 100, 1,000, 10K?

    If the HTTP server is really creating and destroying threads for each request, then creating 0MQ sockets repeatedly will stress the OS. Depending on the volume of requests and your process limits, it will either work or run out of handles. You can test this trivially, and that's a first step.

    Then, sharing a pool of sockets (what you mean by "ZMQ publisher") is nasty. People do this, but sockets are not thread-safe, so you have to be very careful whenever you hand a socket over to another thread.

    If there is a way to keep the threads persistent then each one can create its PUB socket if it needs to, and hold onto it as long as it exists. If not, then my first design would create/destroy sockets anyhow, but use inproc:// to send messages to a single permanent forwarder thread (a SUB-PUB proxy). I'd test this and then if it breaks, go for more exotic designs.

    In general it's better to make the simplest design and break it, than to over-think the design process (especially when starting out).
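
The forwarder idea from the answer can be sketched in plain Java. Note the swap: instead of per-thread 0MQ sockets talking over inproc:// to a SUB-PUB proxy, this version uses a `LinkedBlockingQueue` as the in-process hop, so it shows the shape of the design rather than 0MQ code. `Forwarder`, `Outbound`, and `ForwarderDemo` are hypothetical names; in a real setup `Outbound` would presumably wrap the forwarder's single long-lived PUB socket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * One long-lived thread owns the only outbound channel; request threads just enqueue.
 * The queue stands in for the inproc:// hop and Outbound for the forwarder's PUB socket.
 * Neither is ZeroMQ API.
 */
final class Forwarder implements AutoCloseable {
    interface Outbound {
        void send(String message);
    }

    // Sentinel compared by identity, so a real message with the same text is not mistaken for it.
    private static final String STOP = new String("stop");

    private final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
    private final Thread thread;

    Forwarder(Outbound outbound) {
        thread = new Thread(() -> {
            try {
                // Only this thread ever touches `outbound`, so it needs no locking.
                for (String m = inbox.take(); m != STOP; m = inbox.take()) {
                    outbound.send(m);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "forwarder");
        thread.start();
    }

    /** Called from any request thread; only enqueues, never touches the outbound channel. */
    void publish(String message) {
        inbox.add(message);
    }

    /** Forwards everything already queued, then stops the forwarder thread. */
    @Override
    public void close() {
        inbox.add(STOP);
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

final class ForwarderDemo {
    /** Starts short-lived request threads that each publish one message; returns what was forwarded. */
    static List<String> run(int threads) throws InterruptedException {
        // Written only by the forwarder thread and read after close() has joined it.
        List<String> wire = new ArrayList<>();
        try (Forwarder forwarder = new Forwarder(wire::add)) {
            Thread[] requests = new Thread[threads];
            for (int i = 0; i < threads; i++) {
                String msg = "cache-update-" + i;
                requests[i] = new Thread(() -> forwarder.publish(msg));
                requests[i].start();
            }
            for (Thread t : requests) {
                t.join();
            }
        }
        return wire;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(150).size() + " messages forwarded");
    }
}
```

The point of the design is that the expensive, long-lived resource lives in exactly one thread, while request threads only pay for a cheap in-process hand-off. Whether inproc:// sockets created per request are cheap enough in practice is exactly what the answer suggests testing first.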