KeyDB and multithreading: Looks like no multithreading is going on?


I am experimenting with KeyDB to see whether, and by how much, performance can be improved, as there are definite bottlenecks with Redis's single-threaded query model. KeyDB claims to use "real" multithreading to run queries against the db in parallel, unlike Redis, which multithreads only the IO and not the queries themselves.

From the documentation link above:

Unlike Redis6 and Elasticache, KeyDB multithreads several aspects including placing the event loop on multiple threads, with network IO, and query parsing done concurrently.

My simple test setup:

  1. First, I install KeyDB on Ubuntu (WSL2) and get it running
    1. I note that when starting KeyDB, 2 threads are active: Thread 0 alive. Thread 1 alive.
  2. I modify keydb.conf to disable some saving/persisting and, most importantly, set the server-threads option to 2: server-threads 2. Note: I have also tried skipping the config file and passing the command-line flag --server-threads 2 instead, and also setting the thread count to 4; no difference.
  3. Then I run a simple script:
    1. Create 1M entries in a hash, each a simple JSON object
    2. Create a simple console app that uses two threads: one thread does very simple SETs (SET key1 1) or GETs (GET key1) in a loop, and the other does a "fetch all" from the hash (HGETALL testhash). The second thread waits 1 second before it starts its "long query".
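
The two-thread probe described above can be sketched as follows. This is a minimal Python analogue, not the C# console app from the repo: `run_probe`, `fast_op`, and `slow_op` are hypothetical names, and the operations are passed in as plain callables so the harness itself assumes nothing about the client library.

```python
import threading
import time


def run_probe(fast_op, slow_op, slow_delay=1.0, fast_interval=0.03, run_for=3.0):
    """Call fast_op in a loop on one thread and slow_op once on another.

    Returns (fast_latencies_ms, slow_latency_ms).
    All names and defaults here are illustrative assumptions.
    """
    fast_latencies = []
    slow_latency = []
    stop = threading.Event()

    def fast_loop():
        # Time each quick operation individually until asked to stop.
        while not stop.is_set():
            start = time.perf_counter()
            fast_op()
            fast_latencies.append((time.perf_counter() - start) * 1000)
            time.sleep(fast_interval)

    def slow_once():
        # Wait before issuing the single "long query", then time it.
        time.sleep(slow_delay)
        start = time.perf_counter()
        slow_op()
        slow_latency.append((time.perf_counter() - start) * 1000)

    threads = [threading.Thread(target=fast_loop),
               threading.Thread(target=slow_once)]
    for t in threads:
        t.start()
    time.sleep(run_for)   # let the fast loop run for a while
    stop.set()
    for t in threads:
        t.join()          # also waits for a slow_op that outlives run_for
    return fast_latencies, slow_latency[0]
```

With the redis-py client, for example, `fast_op` could be `lambda: r1.get("key1")` and `slow_op` could be `lambda: r2.hgetall("testhash")`, where `r1` and `r2` are separate `redis.Redis` clients so the two threads do not share a connection. A spike in the fast latencies shortly after the slow operation starts is the symptom being investigated.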

GitHub repo (using StackExchange.Redis lib) can be found here.

What I expect:

I expect the simple, quick SET/GETs to take approximately the same time every time, without any delays or throttling caused by KeyDB blocking while the long query is running.

What happens:

The simple quick SET/GETs are blocked/delayed for around 500-700 ms while the long query is running, indicating that only one thread is being used and thus blocking other operations. This is in line with how Redis works, and what I wanted to avoid with KeyDB.

Log:

The "Starting long query" line is where we issue the HGETALL. Almost immediately afterwards, the simple operation is throttled and takes over 500 ms when it should take 0-1 ms, as can be seen before and after.

Using ServiceStack Redis client:

10:50:55.336    GetValueFromHashAsync took 1
10:50:55.367    GetValueFromHashAsync took 1
10:50:55.397    GetValueFromHashAsync took 0
10:50:55.416    Starting long query
10:50:56.191    GetValueFromHashAsync took 766 <-- THROTTLED! Delayed with what I think is the actual query time, not the IO part, so at this point, the line fetching data has not completed yet
10:50:56.228    GetValueFromHashAsync took 0
10:50:56.261    GetValueFromHashAsync took 1
....
....
10:51:00.592    GetValueFromHashAsync took 1
10:51:00.620    GetValueFromHashAsync took 1
10:51:00.651    GetValueFromHashAsync took 1
10:51:00.663    Long query done in 5244        <-- The long query returns here, line is completed, total time was about 5 seconds, while the block was about 0.7 seconds

I have also tested doing a Get from the hash instead of a SET; same result.

Using StackExchange.Redis: In the reproducible GitHub project, found here, I use StackExchange.Redis instead of ServiceStack, and I get a different (worse!) behaviour:

11:27:12.084    HashGetAsync took 0
11:27:12.115    HashGetAsync took 0
11:27:12.146    HashGetAsync took 0
11:27:12.177    HashGetAsync took 1
11:27:12.183    Starting long query
11:27:14.877    Long query done in 2692
11:27:14.893    HashGetAsync took 2686      <-- THROTTLED! This time the other thread is delayed the entire time, query + IO.
11:27:14.929    HashGetAsync took 0
11:27:14.960    HashGetAsync took 0
11:27:14.992    HashGetAsync took 0
11:27:15.023    HashGetAsync took 0
11:27:15.053    HashGetAsync took 0

Conclusion

Regardless of which client library I use, KeyDB throttles requests/queries while a "long query" is running, even though I have 2 threads. Starting KeyDB with 4 threads makes no difference.

I don't know why StackExchange behaves differently from ServiceStack, but that is not the main question right now.


Solution

  • KeyDB, in fact, runs only the IO operations and Redis protocol parsing in parallel. It executes commands serially, i.e. one at a time, and the worker threads are synchronized with a spin lock.

    That's why those simple SET/GET commands are blocked by a slow command. So even with KeyDB, you should NOT run slow commands; the multithreading won't help there.

    UPDATE

    KeyDB can have multiple threads listening on the same IP:port (i.e. SO_REUSEPORT), so it can accept multiple connections in parallel. It also reads from sockets (including parsing the received data into commands using the Redis protocol, i.e. RESP) and writes to sockets in parallel.

    Redis, by contrast, has only a single thread, i.e. the main thread, listening on the IP:port. By default, Redis reads from and writes to sockets in that single thread. Since Redis 6.0, you can enable io-threads to make it write to sockets in parallel. If you also enable io-threads-do-reads, Redis will do the reading and protocol parsing in parallel as well.
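
The behaviour described in this answer can be illustrated with a toy model. The sketch below is not KeyDB code: `ToyServer` and `fast_latency_during_slow` are hypothetical names, a Python mutex stands in for KeyDB's spin lock, and `time.sleep` stands in for the work done by a slow command. Parsing happens outside the lock, so it could proceed on several threads at once, while execution happens under the lock.

```python
import threading
import time


class ToyServer:
    """Toy model: parallel parsing, serialized command execution."""

    def __init__(self):
        # One global lock around execution (a mutex here; the answer
        # describes a spin lock in KeyDB).
        self._exec_lock = threading.Lock()

    def handle(self, raw_command, work_seconds=0.0):
        # "IO and parsing" step: runs without the lock, on the caller's thread.
        parts = raw_command.split()
        # "Execution" step: only one command at a time gets past this point.
        with self._exec_lock:
            time.sleep(work_seconds)  # stand-in for the command's actual work
            return parts[0]


def fast_latency_during_slow(server, slow_seconds=0.5):
    """Return how long a fast command takes (ms) while a slow one is running."""
    started = threading.Event()

    def slow_client():
        started.set()
        server.handle("HGETALL testhash", work_seconds=slow_seconds)

    t = threading.Thread(target=slow_client)
    t.start()
    started.wait()
    time.sleep(0.05)  # give the slow command time to acquire the lock

    begin = time.perf_counter()
    server.handle("GET key1")  # no work of its own, but must wait for the lock
    elapsed_ms = (time.perf_counter() - begin) * 1000
    t.join()
    return elapsed_ms
```

Calling `fast_latency_during_slow(ToyServer())` should report a latency close to the remaining run time of the slow command, however many client threads exist. That is the same pattern as the stalled GetValueFromHashAsync call in the question's log.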