Tags: node.js, http, concurrency, c10k

Does slow disk I/O degrade the performance of the rest of the Node.js application?


I'm in a small team developing a single-page application that relies heavily on low-latency queries over WebSockets. The back-end runs on Node.js + Redis. It needs to support hundreds to thousands of simultaneous connections, and requests need to be served within 50-100 ms (under good network conditions on the client side). We're fairly happy with our initial implementation of this part of the server; it is performing as expected.

We also need to serve lots of static files over HTTP. These requests are not time sensitive. Because of the large storage requirements, we would like to opt for an array of HDDs instead of SSDs for cost reasons.

Is there a risk of slow disk I/O degrading the performance of the rest of the Node.js application (the WebSocket part uses only the in-memory database), or is it going to affect strictly the HTTP / static file serving part of the server? As far as I understand, Node.js, with its asynchronous nature, should be well suited to this kind of situation, because it would allow the WebSockets module to process queries normally while the HTTP module is waiting for the disks to read/write. Is that right?
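
To illustrate my mental model (the file path and the timer standing in for WebSocket traffic are made up for the sketch), this is the behaviour I'm counting on: an asynchronous read is handed off, and the event loop keeps ticking while it's pending:

```js
// Rough sketch of the idea: a slow asynchronous disk read does not stop
// the event loop from handling other callbacks in the meantime.
const fs = require('fs');

// Hypothetical large file on the HDD array.
fs.readFile('/mnt/hdd-array/static/large-video.mp4', (err, data) => {
  if (err) throw err;
  console.log(`read finished: ${data.length} bytes`);
});

// Stand-in for WebSocket traffic: these ticks keep firing on schedule
// while the read above is still pending, because the read is handed off
// to libuv rather than executed on the main thread.
const start = Date.now();
const timer = setInterval(() => {
  console.log(`still responsive after ${Date.now() - start} ms`);
}, 100);
setTimeout(() => clearInterval(timer), 3000); // stop the demo after 3 s
```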

Perhaps the large number of "waiting to be served" HTTP requests can clog the server in some way (after all, they need to be stored somewhere, and it's probably not free to poll whether the reads/writes have completed either), and we need to consider using either a separate Node.js process for serving static files or even a separate dedicated server altogether?

I can think of the following things:

  • the "waiting to be served" HTTP request are going to be using up the limited amount of concurrent TCP connections available
  • the "waiting to be served" HTTP requests are also going to use up some RAM
  • system files should probably reside on a disk that is not going to be busy serving the static files

We can't test this scenario in the real world just yet, so I would be very thankful to hear from anyone with similar experiences. This could require us to rethink our architecture, and that's something we would prefer to discover sooner rather than later.


Solution

  • Short answer: yes, there is a risk. See:

    https://nodejs.org/api/fs.html#fs_threadpool_usage

    Threadpool Usage

    All file system APIs except fs.FSWatcher() and those that are explicitly synchronous use libuv's threadpool, which can have surprising and negative performance implications for some applications. See the UV_THREADPOOL_SIZE documentation for more information.

    http://docs.libuv.org/en/v1.x/design.html#file-i-o

    The libuv threadpool is backed by system threads, so its scaling profile will be very different from the rest of the application. IMO this is not necessarily more risky, just different, and hard data would help you see the scaling profile of your application.
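
    To make that concrete, here is a small experiment (the file name and the read count are placeholders) showing fs requests queueing behind the default threadpool of 4 threads; running it again with a larger UV_THREADPOOL_SIZE changes the timings:

```js
// Run as:  node threadpool-test.js
// then as: UV_THREADPOOL_SIZE=16 node threadpool-test.js
// and compare how quickly the later reads complete.
// './big-file.bin' is a placeholder for one of your large static assets.
const fs = require('fs');

const FILE = './big-file.bin';
const CONCURRENT_READS = 16;

const start = Date.now();
for (let i = 0; i < CONCURRENT_READS; i++) {
  fs.readFile(FILE, (err) => {
    if (err) throw err;
    // With the default pool size of 4, reads beyond the first 4 sit in
    // libuv's queue, so their callbacks fire noticeably later.
    console.log(`read ${i} done after ${Date.now() - start} ms`);
  });
}
```

    Note that UV_THREADPOOL_SIZE has to be set in the environment before the threadpool is first used, and raising it only adds more worker threads; it does not make the disks themselves any faster.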


    I think general best practice is to offload serving of static files if at all possible. Having a proxy in front of your application that can quickly serve static files (e.g. nginx) might help: not because it fixes the core issue of blocking file system reads, but because it may offer more uniform and predictable performance and scaling, since it only has to serve files, while your application would need to context-switch between handling users AND serving files. I personally would pursue your own suggestions of using a separate process or a CDN to remove static file serving from your application entirely; a rough sketch of the separate-process option follows below.
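
    As a rough illustration of the separate-process option (the port, directory and environment variables are placeholders, not something your stack prescribes), the static file server can run as its own Node.js process so that slow HDD reads only compete for that process's libuv threadpool and memory:

```js
// static-server.js - run as its own process (e.g. `node static-server.js`)
// so slow HDD reads never share a threadpool with the WebSocket process.
const http = require('http');
const fs = require('fs');
const path = require('path');

const STATIC_DIR = process.env.STATIC_DIR || '/mnt/hdd-array/static'; // placeholder
const PORT = process.env.PORT || 8081;                                // placeholder

http.createServer((req, res) => {
  // NOTE: no path sanitisation here; a real deployment would put nginx or a
  // hardened static-file middleware in front of (or instead of) this sketch.
  const filePath = path.join(STATIC_DIR, path.normalize(req.url));
  fs.stat(filePath, (err, stats) => {
    if (err || !stats.isFile()) {
      res.statusCode = 404;
      return res.end();
    }
    res.setHeader('Content-Length', stats.size);
    fs.createReadStream(filePath).pipe(res);
  });
}).listen(PORT);
```

    The WebSocket/Redis process then stays untouched by disk latency, and the two can later be moved onto separate machines (or the static part replaced by a CDN) without changing application code.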