database, cassandra, nosql, key-value, scylla

Key-value database with random value access


I'm looking for a database for storing binary data (16KB per document). Requirements:

  • high availability
  • random access to bytes within a value, e.g. reading from byte offset 40 to the end

I read the docs for ScyllaDB, RocksDB, and Riak KV, but I couldn't find support for the second requirement. Right now the app uses MySQL with blobs, but we have hit the performance limits of this setup.

I'm thinking of using the ScyllaDB/Cassandra data model: the primary key can be a UUID, and the secondary key would be the index of one of several small chunks. In this model, a client can query a range of secondary keys and discard the extra bytes.

Is there a database that meets the above requirements?


Solution

  • Scylla, Cassandra and DynamoDB (which all share a similar data model) cannot read a substring of a string attribute in the way you hoped.

    However, all three let you do exactly what you described, and it is indeed a good solution: you can split the 16-KB string into, say, sixteen 1-KB strings, each stored in a separate item with the same partition key. You can then read only the part of that partition that covers the bytes you want. All three databases have an efficient way to do this (the read from disk of the entire partition is sequential and efficient).

    You might argue that you want to read only bytes 1056-1237 and not bytes 1024-2047. However, in reality, everything from disks to cloud billing works in chunks, so even if you had the technical ability to query bytes 1056-1237 instead of 1024-2047, it might not cost you any less.
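
The chunking scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definitive implementation: the 1-KB chunk size follows the example in the answer, the function names are made up for this sketch, and a plain dict stands in for the partition (in Scylla or Cassandra the lookup would presumably be a single range query on the chunk number, as noted in the comment, against a hypothetical table).

```python
CHUNK_SIZE = 1024  # assumption: 1-KB chunks, as in the example above


def split_into_chunks(blob: bytes, chunk_size: int = CHUNK_SIZE) -> dict[int, bytes]:
    """Split a blob into numbered chunks; the number plays the role of the secondary key."""
    return {
        offset // chunk_size: blob[offset:offset + chunk_size]
        for offset in range(0, len(blob), chunk_size)
    }


def chunk_range(start: int, end: int, chunk_size: int = CHUNK_SIZE) -> tuple[int, int]:
    """Return the inclusive range of chunk numbers covering bytes [start, end)."""
    return start // chunk_size, (end - 1) // chunk_size


def read_bytes(chunks: dict[int, bytes], start: int, end: int,
               chunk_size: int = CHUNK_SIZE) -> bytes:
    """Read bytes [start, end) by fetching whole chunks and trimming the excess."""
    first, last = chunk_range(start, end, chunk_size)
    # With a real database this loop would be one range query, for example
    # (hypothetical table and column names):
    #   SELECT data FROM blobs WHERE id = ? AND chunk_no >= ? AND chunk_no <= ?
    fetched = b"".join(chunks[n] for n in range(first, last + 1))
    skip = start - first * chunk_size  # bytes to discard at the front
    return fetched[skip:skip + (end - start)]


# A 16-KB document split into sixteen 1-KB chunks.
blob = bytes(i % 251 for i in range(16 * 1024))
chunks = split_into_chunks(blob)

# Bytes 1056-1237 (inclusive) live entirely in chunk 1, i.e. bytes 1024-2047.
part = read_bytes(chunks, 1056, 1238)

# "From byte 40 to the end" touches every chunk, starting with chunk 0.
tail = read_bytes(chunks, 40, len(blob))
```

Here the client fetches one full 1-KB chunk to obtain 182 bytes and discards the rest, which is the trade-off the answer describes. A smaller chunk size would reduce the wasted bytes at the price of more items per partition.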