Tags: java, multithreading, concurrency, client-server, nio

Multithreaded server in Java


The server is essentially a queue of items, while the clients act as producers and/or consumers of such items.

The server must:

  • Listen for put / take requests and process them accordingly; this typically won't take long, as it consists of (a minimal sketch follows this list):
    1. Parsing a short string;
    2. A HashMap.get;
    3. Acquiring a lock;
    4. A PriorityQueue.poll or PriorityQueue.offer.
  • Notify every client of all item activity, as soon as possible, so that every client has a real-time view of what's going on.
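
For concreteness, here is a minimal sketch of the per-request work from the first bullet; the wire format, class name, and queue names are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Minimal sketch of the per-request work; the protocol and names are made up.
class ItemStore {
    private final Map<String, PriorityQueue<String>> queues = new HashMap<>();

    ItemStore() {
        queues.put("default", new PriorityQueue<>());
    }

    // Handles requests like "put default item42" or "take default".
    String handle(String request) {
        String[] parts = request.split(" ");                 // 1. parse a short string
        if (parts.length < 2) return "ERR bad request";
        PriorityQueue<String> queue = queues.get(parts[1]);  // 2. HashMap.get
        if (queue == null) return "ERR no such queue";
        synchronized (queue) {                               // 3. acquire a lock
            if (parts[0].equals("put") && parts.length == 3) {
                queue.offer(parts[2]);                       // 4. PriorityQueue.offer
                return "OK";
            }
            String item = queue.poll();                      // 4. PriorityQueue.poll
            return item == null ? "EMPTY" : item;
        }
    }
}
```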

The easiest way of setting this up is to have one thread accepting clients, and then create two threads for each client (sketched after this list):

  • One that handles the InputStream, which blocks waiting for requests;
  • And another to handle the OutputStream, which listens for events on the queue, and sends the information to the client.
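
A rough sketch of that layout, assuming a line-based protocol and a per-client event queue; the port and the echo-style "processing" are placeholders:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Two threads per client: a blocking reader and a blocking writer.
public class TwoThreadsPerClientServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(5000)) {
            while (true) {
                Socket client = server.accept();
                BlockingQueue<String> events = new LinkedBlockingQueue<>();

                // Reader: blocks on the InputStream waiting for requests.
                new Thread(() -> {
                    try (BufferedReader in = new BufferedReader(
                            new InputStreamReader(client.getInputStream()))) {
                        String line;
                        while ((line = in.readLine()) != null) {
                            events.add("processed: " + line); // stand-in for real work
                        }
                    } catch (IOException ignored) { /* client went away */ }
                }).start();

                // Writer: blocks on the event queue, pushes updates to the client.
                new Thread(() -> {
                    try (PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                        while (true) {
                            out.println(events.take()); // shutdown handling omitted
                        }
                    } catch (IOException | InterruptedException ignored) { }
                }).start();
            }
        }
    }
}
```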

Surely this isn't scalable, and it seems wasteful to have two threads for each client.

I also thought about using a single thread per client (sketched after this list), which would:

  • Set a socket read timeout of about 1 second;
  • Send every new event to the client, either after the read times out or after processing a request;
  • Loop over these two actions.
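
A sketch of that loop, again assuming a line-based protocol; the event queue is filled elsewhere, and the names are illustrative:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// One thread per client, alternating a 1-second blocking read with a flush
// of pending events.
class PollingClientHandler implements Runnable {
    private final Socket client;
    private final Queue<String> pendingEvents = new ConcurrentLinkedQueue<>();

    PollingClientHandler(Socket client) { this.client = client; }

    @Override public void run() {
        try {
            client.setSoTimeout(1000); // reads give up after ~1 second
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream()));
            PrintWriter out = new PrintWriter(client.getOutputStream(), true);
            while (true) {
                try {
                    String request = in.readLine(); // may block up to 1s
                    if (request == null) break;     // client closed the connection
                    out.println("processed: " + request);
                } catch (SocketTimeoutException e) {
                    // no request this round; fall through and flush events
                    // (mixing read timeouts with buffered readers is fragile)
                }
                String event;
                while ((event = pendingEvents.poll()) != null) {
                    out.println(event); // push whatever queued up meanwhile
                }
            }
        } catch (IOException ignored) { /* connection error: drop the client */ }
    }
}
```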

However, polling for requests and events is also wasteful.

Another approach would use a thread pool and put each of the two above actions in its own Runnable. These runnables would then re-queue each other in the Executor (sketched below). This seems just as wasteful, if not more.
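
Roughly like this, where pollOneRequest is a hypothetical placeholder for one bounded, non-blocking unit of work:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Tasks that re-queue themselves on a shared pool. With nothing to do they
// spin through the executor, which is exactly the waste noted above.
class RequeuingTasks {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        pool.submit(new Runnable() {
            @Override public void run() {
                pollOneRequest();  // one small unit of work; must not block
                pool.submit(this); // then go to the back of the queue
            }
        });
    }

    static void pollOneRequest() { /* check a socket or queue without blocking */ }
}
```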

I've been reading some questions, and I'm now curious about NIO, since non-blocking operations and an event-driven server seem to be the right approach.
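
For reference, the core of a selector-based NIO server is a single event loop along these lines (a plain echo server here, since the queue logic is orthogonal; real code would also handle partial writes):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Single-threaded NIO event loop: non-blocking channels, readiness events.
public class NioEchoServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(5000)); // arbitrary port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(1024);
        while (true) {
            selector.select(); // blocks until at least one channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buf.clear();
                    int n = client.read(buf);
                    if (n == -1) { key.cancel(); client.close(); continue; }
                    buf.flip();
                    client.write(buf); // echo back; real code would parse
                                       // the request and dispatch it
                }
            }
        }
    }
}
```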

Is any of the above designs suitable for this task, or should I tackle it with NIO?

In terms of numbers, this is more of an exercise than a real system, so it doesn't have to deal with thousands of clients, but, ideally, it should be able to perform and scale well.


Solution

  • Two threads per client is definitely not scalable.

    If you have M cores on the server, you can't actually do better than having M threads running. Anything higher exposes you to thrashing, and will reduce the number of executed operations per second.

    A better design partitions the available cores into U updaters and L listeners, where U + L == M.

    Each client is assigned (ideally with load balancing, but that's an embellishment) to an updater thread. Each update event is multicast to all the updater threads, which each then update all their assigned clients. A client at the end of an updater's list is updated later than one at the beginning, but there is no help for it: you only have so much hardware.
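
    A minimal sketch of one such updater; Client is a placeholder interface, and multicasting an event just means calling post on every updater:

    ```java
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.concurrent.LinkedBlockingQueue;

    // One of U updater threads: it owns a slice of the clients and drains
    // a private event inbox. Client is a stand-in for the real connection type.
    class Updater implements Runnable {
        interface Client { void send(String event); }

        private final List<Client> assignedClients = new CopyOnWriteArrayList<>();
        private final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();

        void assign(Client c) { assignedClients.add(c); }

        // Multicast: the event source calls post on every updater.
        void post(String event) { inbox.add(event); }

        @Override public void run() {
            try {
                while (true) {
                    String event = inbox.take();        // wait for the next event
                    for (Client c : assignedClients) {  // update this thread's slice
                        c.send(event);                  // later clients see it later
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
    ```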

    Similarly, each client is assigned to a listener thread, and each listener thread handles more than one client. Client input is dumped into a FIFO queue and processed by the listener thread as soon as it gets around to it.
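
    A matching sketch of the listener side; Request is a made-up carrier type (records need Java 16+):

    ```java
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // One of L listener threads: client input lands in a FIFO and is processed
    // in arrival order. Request is a placeholder for a parsed-enough message.
    class Listener implements Runnable {
        record Request(int clientId, String payload) {}

        private final BlockingQueue<Request> fifo = new LinkedBlockingQueue<>();

        // Called by the I/O code whenever bytes arrive from an assigned client.
        void enqueue(Request r) { fifo.add(r); }

        @Override public void run() {
            try {
                while (true) {
                    process(fifo.take()); // FIFO order, as described above
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        private void process(Request r) {
            // Stand-in for the real work: parse, HashMap.get, lock, poll/offer.
            System.out.println("client " + r.clientId() + ": " + r.payload());
        }
    }
    ```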

    Each thread then stays active and in memory while client data moves through the system. The design degrades gracefully: as clients pile up, updates slow down as a linear function of the number of clients. The designs you propose would degrade faster than that.

    Modern (say, post-2002) web servers bury all of this deep in their implementation, so developers don't need to manage it. But it's still a useful exercise.