Tags: node.js, database, libuv

Writing a database with async I/O


I recently came across libuv, the low-level library that lets Node.js do its async magic. This got me thinking, and I would like clarification on the following points:

  1. Node.js has async I/O calls. However, if I'm calling APIs on a (remote) database, the actual read/write into the DB will be synchronous, but Node does not have to wait for it. Maybe it's possible to make the DB itself write to the disk asynchronously? Are there any databases that use libuv for actual async I/O?

  2. JavaScript is famously single-threaded. I understand that the Node.js runtime need not be - I can fire up 4 instances if I have 4 CPU cores. But if I used libuv to write yet another web framework in a language that supports threads, wouldn't it have all the goodness of async I/O AND multithreading? Does something like that already exist?


Solution

  • You're mixing up two concepts. The fact that you can wait asynchronously (via epoll/kqueue/libuv...) while querying a service doesn't mean that the query is non-blocking on the other side, and vice versa. Nor does it mean that, because things "feel" async inside your event loop, they truly are.

    Let's go back to what an event loop is and how Node.js does its magic. I feel it's a good place to start the story.

    The visible part of an event loop is a change in the way code is written - from mostly synchronous to mostly asynchronous. The invisible part is that this asynchronous code is handed off, as much as possible, to an event loop which, in the background, checks for things to do - I/O, timers, etc. It isn't a new idea, and it does its job (providing concurrency) really well.

    libuv's documentation is actually very descriptive on this point: it lays out the design choices the authors made, and from there comes this flowchart:

    [Flowchart: the libuv event loop design]

    Note that nowhere do they state that they have made anything truly async - because they haven't. The underlying system calls remain synchronous; for file I/O, libuv simply runs the blocking calls on a worker thread pool. It just feels asynchronous. That is the key takeaway.

    Regarding disk I/O on databases: I gave a talk in The Hague a while back about this, and, quite frankly, most of the crucial I/O is blocking. For instance, you can't go "Hey, I'll update the disk snapshot and the append-only txlog at the same time!" - because, if one of them fails, you've got a serious, serious rollback issue and possibly unknown state.

    Regarding question 2, I'd give code examples, but I'm not sure which languages you are familiar with. The bottom line is: the moment something crosses a thread boundary, things become hell. A very naive example would be this - suppose your event loop has two timers as follows:

    • Timer 1, firing every 0.5s, increments a given state variable A
    • Timer 2, firing every time somebody provides user input, multiplies the state variable by 2.

    Suppose you're running on a single thread. Even though your event loop feels asynchronous, it is completely sequential - timer 1 will never run while timer 2 is running.

    Now add in a second thread, make timer 2 run from it. Without a guard in place, there is a fair possibility that something, somewhere, will go very wrong.

    To multiply something by 2 the naive way (without taking advantage of CPU instructions dedicated to this kind of thing), one has to fetch the variable, multiply it by 2, then write it back to memory.

    The same goes for incrementing: it is also a three-stage process (again, taking the naive approach).

    Once those two timers clash, you can get some crazy race conditions like the following:

    THREAD 1            | THREAD 2
    read A   (A=1)      |
    local: A = 1+1 = 2  | read A   (A=1, not updated yet!)
                        | local: A = 1*2 = 2
    write A = 2         | write A = 2

    Thread 2 started running halfway through thread 1's computation, read a stale value of the state variable (thread 1 had not written its result back yet), and multiplied it by 2. Sequentially you should have ended up with 3 (or 4, depending on which timer ran first), but in reality you ended up with 2 - thread 1's increment was simply lost.

    To protect against this, there is a whole bunch of methods and tools. Most processor architectures these days have atomic instructions (x86's LOCK-prefixed instructions, for instance), and developers can leverage those if they know where they need them. On top of these you have a whole toolbox - mutexes, read/write locks, semaphores, etc. - to reduce or remove those issues, at a cost, and provided you know where you'll need them.

    Needless to say, it is far from trivial to generalize this.