
Asynchronous I/O - Java


I have been searching for details on the advantages of asynchronous I/O in Java, particularly from an application stack design perspective.

I encountered numerous examples of event-driven servers such as Node.js, Tornado, etc.

What I fail to understand is why someone with an entire application stack in Java EE, running on a JBoss or WebLogic app server, would migrate to an event-driven architecture.

Even these servers support non-blocking I/O. Yes, they are allotting a thread for each request, but with a thread pool in place, wouldn't resource usage stay well within acceptable performance limits?

Kindly provide some input along the following lines:

  1. Why would a traditional Java EE architecture with Apache Tomcat/JBoss/WebLogic consider a move to an event-driven architecture?
  2. Would an event-driven architecture help in providing a device-agnostic website/application?
  3. When designing an application for the cloud, should we go for asynchronous I/O?
  4. Is the performance of an event-driven architecture really better than that of a traditional Java EE architecture, or is that a myth?

Solution

  • One of the key concepts that you have mentioned is:

    Yes, they are allotting a thread for each request

    It's been shown time and again that having a thread per request with an I/O-bound app will eventually exhaust your thread pool when your goal is to support a large number of concurrent users. The frameworks you mention, such as Node.js and Tornado, excel at handling a large number of concurrent users when the application spends most of its time waiting for something to happen and does little or no CPU-bound work. In other words, these tools are great for building real-time apps such as online games, chat rooms, logging systems, and notification systems, where the primary goal is to pass many small messages among many users as quickly as possible.
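
    The exhaustion described above can be sketched in Java, the question's own language. This is a minimal illustration under stated assumptions: each "request" is simulated with `Thread.sleep` standing in for an I/O wait, and the class name `PoolExhaustionDemo`, the pool sizes, and the wait times are made up for the example.

    ```java
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    class PoolExhaustionDemo {
        // Simulated I/O-bound request (assumption): the thread only waits, it burns no CPU.
        static void handleRequest(long waitMillis) {
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        // Runs `requests` simulated requests on `poolSize` threads and returns the elapsed milliseconds.
        static long run(int poolSize, int requests, long waitMillis) throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(poolSize);
            long start = System.nanoTime();
            for (int i = 0; i < requests; i++) {
                pool.submit(() -> handleRequest(waitMillis));
            }
            pool.shutdown();                            // accept no new work
            pool.awaitTermination(1, TimeUnit.MINUTES); // wait for the queued requests to drain
            return (System.nanoTime() - start) / 1_000_000;
        }

        public static void main(String[] args) throws InterruptedException {
            // With fewer threads than waiting requests, the extra requests sit in the queue.
            System.out.println("4 threads, 16 requests:  " + run(4, 16, 100) + " ms");
            System.out.println("16 threads, 16 requests: " + run(16, 16, 100) + " ms");
        }
    }
    ```

    Once every pool thread is parked in a wait, new requests queue up even though the CPU is idle. Growing the pool hides the problem only until the cost of the threads themselves becomes the limit.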

    In fact, these tools pair well with WebSocket-based applications, because those are all about offering a real-time or near-real-time experience to the user.

    While it's true that many companies adopt these platforms from the get-go, I think it's more common for companies with traditional stacks to use event-driven tools as a supplement to their existing system. When you go with something like Node.js or Tornado, you may find yourself giving up a lot of built-in software that you rely on and having to roll your own APIs and drivers instead. Node.js has been around for a while now, and there is a lot of good support for hooking into databases, NoSQL platforms, and build systems, but it took a while to get there.

    As an experiment, try writing a simple TCP chat application that uses one thread per connection and see how many users you can support. Eventually you will hit a limit on how many OS threads you can spin up, and each of those threads is expensive.
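
    A minimal sketch of that experiment, assuming a line-based protocol where every line a client sends is broadcast to all connected clients. The class name `ThreadPerClientChat` and the `welcome` greeting are made up for this example.

    ```java
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    class ThreadPerClientChat implements Runnable {
        private final ServerSocket server;
        // Writers of all connected clients; concurrent because every client thread touches the set.
        private final Set<PrintWriter> clients = ConcurrentHashMap.newKeySet();

        ThreadPerClientChat(int port) throws IOException {
            server = new ServerSocket(port); // port 0 asks the OS for any free port
        }

        int port() {
            return server.getLocalPort();
        }

        // Accept loop: every new connection costs one dedicated OS thread.
        @Override
        public void run() {
            try {
                while (true) {
                    Socket socket = server.accept();
                    Thread handler = new Thread(() -> serve(socket));
                    handler.setDaemon(true);
                    handler.start();
                }
            } catch (IOException e) {
                // Server socket closed or failed: stop accepting.
            }
        }

        // Runs on the client's own thread, which stays blocked in readLine() while the client is idle.
        private void serve(Socket socket) {
            PrintWriter out = null;
            try (Socket s = socket;
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
                out = new PrintWriter(
                        new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
                clients.add(out);
                out.println("welcome"); // hypothetical greeting: tells the client it is registered
                String line;
                while ((line = in.readLine()) != null) {
                    for (PrintWriter client : clients) {
                        client.println(line); // broadcast to everyone, sender included
                    }
                }
            } catch (IOException e) {
                // Connection dropped: fall through to cleanup.
            } finally {
                if (out != null) {
                    clients.remove(out);
                }
            }
        }
    }
    ```

    To try it, run `new ThreadPerClientChat(9000).run()` from a `main` method (the port is arbitrary) and keep opening connections. Each idle user pins a thread and its stack, which is the cost the experiment is meant to expose.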

    Then see how far you can get with Node.js using just one thread, its default event-loop thread. You will find that you can support an extremely large number of concurrent connections. It has been reported to scale into the millions of connections, because it is not limited by threads; at that point it is limited only by memory, the number of file descriptors, and CPU.
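
    The same single-threaded idea can be sketched in Java with `java.nio` selectors. This is not how Node.js is implemented; it is a minimal echo server, under the assumption that echoing bytes back is enough to show one thread serving many connections. The class name `SelectorEchoServer` and the buffer size are made up for the example.

    ```java
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;

    class SelectorEchoServer implements Runnable {
        private final Selector selector;
        private final ServerSocketChannel server;

        SelectorEchoServer(int port) throws IOException {
            selector = Selector.open();
            server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(port)); // port 0 asks the OS for any free port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
        }

        int port() {
            return server.socket().getLocalPort();
        }

        // Event loop: one thread waits for readiness events on all connections at once.
        @Override
        public void run() {
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            try {
                while (true) {
                    selector.select(); // the only place this thread waits
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey key = keys.next();
                        keys.remove();
                        if (key.isAcceptable()) {
                            accept();
                        } else if (key.isReadable()) {
                            echo((SocketChannel) key.channel(), buffer);
                        }
                    }
                }
            } catch (IOException e) {
                // Selector or server channel failed: stop the loop.
            }
        }

        private void accept() throws IOException {
            SocketChannel client = server.accept();
            if (client != null) {
                client.configureBlocking(false);
                client.register(selector, SelectionKey.OP_READ); // no new thread is created
            }
        }

        private void echo(SocketChannel client, ByteBuffer buffer) {
            try {
                buffer.clear();
                if (client.read(buffer) < 0) {
                    client.close(); // peer closed the connection
                    return;
                }
                buffer.flip();
                while (buffer.hasRemaining()) {
                    // Simplification: a real server would register OP_WRITE instead of looping here.
                    client.write(buffer);
                }
            } catch (IOException e) {
                try {
                    client.close();
                } catch (IOException ignored) {
                    // Nothing more to do for this connection.
                }
            }
        }
    }
    ```

    An idle connection here costs a registered channel and a file descriptor rather than a parked thread, which is why the limits shift to memory, descriptors, and CPU.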

    To answer your questions the best I can:

    1. I don't think it's sensible to ditch your entire platform just because you hear how great Node.js and event-driven architectures are. You really have to ask yourself whether you need to build an I/O-bound, highly concurrent application. If so, why not use it to supplement your existing stack?
    2. I'm not sure about your second question. What do you mean by device?
    3. You can build a great application in the cloud with traditional tools just as well as with event-driven architectures. The fact that it may be a "cloud" application really has nothing to do with choosing the platform.
    4. I would say it's more about scale than performance. You may find that a Node.js app runs slower or faster than a Java app doing the same work. What Node.js can do is sustain much higher throughput under heavy concurrency, because it will not hit the thread limit I mentioned. This assumes you have built a proper event-driven application in which you never block. If you block, you stall the whole system!
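
    The warning about blocking in point 4 can be illustrated with a single-threaded executor standing in for an event loop. This is an analogy under that assumption, not Node.js itself; the class name `BlockedLoopDemo` and its method are made up for the example.

    ```java
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    class BlockedLoopDemo {
        // Returns how many milliseconds a trivial handler waited behind a handler
        // that blocks the single "event loop" thread for blockMillis.
        static long delayBehindBlockingHandler(long blockMillis) throws Exception {
            ExecutorService loop = Executors.newSingleThreadExecutor(); // stand-in for the event loop
            try {
                Callable<Void> blockingHandler = () -> {
                    Thread.sleep(blockMillis); // a handler that blocks instead of yielding
                    return null;
                };
                Callable<Long> quickHandler = () -> System.nanoTime(); // would normally run instantly

                long submitted = System.nanoTime();
                loop.submit(blockingHandler);
                Future<Long> ranAt = loop.submit(quickHandler);
                return (ranAt.get() - submitted) / 1_000_000;
            } finally {
                loop.shutdown();
            }
        }

        public static void main(String[] args) throws Exception {
            System.out.println("quick handler delayed by " + delayBehindBlockingHandler(200) + " ms");
        }
    }
    ```

    Every handler queued behind the blocking one waits for it, no matter how cheap it is. On a real event loop that means every connected user waits, not just the one whose request blocked.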