
Aerospike: keep data as blob or use 'bins'?


I need to keep data in Aerospike. This engine supports 'bins' (a 'bin' is like a column in a row or a field in a record). On the other hand, I can keep my records as serialized blobs. Records are always read from the database as a whole. That is, I never need to fetch only some 'columns' of a record; I need the entire record.

The question is: what is the most efficient way to store data for this scenario in terms of performance? Keep it unserialized and use 'bins' for each of the record's fields, or store it as a serialized blob in a single bin?


Solution

  • If you are sure that your only use case is to fetch the full record, and never individual bins, it is better to store it as a single bin value. (Internally, multiple bins will need multiple mallocs beyond a size limit.) In fact, you can set the namespace config option 'single-bin true', which will optimize things further. Be aware that once you set this config option, it can never be unset, even with a node restart. You have to clean the drives if you want to change this setting. If the namespace is in-memory, this restriction does not apply.

    If there is a possibility that you will need to access a subset of the bins in the future, storing the fields as separate bins is better, because fetching only the bins you need saves network I/O, and that saving will be much bigger than the malloc overhead.
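
To make the two layouts concrete, here is a minimal Python sketch of how a record could be shaped before it is written. It only builds the bin dictionaries and does not talk to a server. The helper names (`to_single_bin`, `from_single_bin`, `to_multi_bin`), the bin name `data`, and the choice of JSON as the serialization format are all assumptions made for illustration; any serializer would do.

```python
import json


def to_single_bin(record: dict, bin_name: str = "data") -> dict:
    """Pack the whole record into one blob stored under a single bin.

    The bin name "data" and the JSON encoding are illustrative choices.
    """
    blob = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # A bytearray is used so the value is treated as raw bytes (a blob).
    return {bin_name: bytearray(blob)}


def from_single_bin(bins: dict, bin_name: str = "data") -> dict:
    """Reverse of to_single_bin: decode the blob back into a record."""
    return json.loads(bytes(bins[bin_name]).decode("utf-8"))


def to_multi_bin(record: dict) -> dict:
    """Keep one bin per field, so individual fields stay addressable."""
    return dict(record)


record = {"id": 42, "name": "alice", "tags": ["a", "b"]}

single = to_single_bin(record)   # one bin: {"data": bytearray(...)}
multi = to_multi_bin(record)     # three bins: "id", "name", "tags"

# With the Aerospike Python client, either dictionary would be passed as the
# bins argument of a write, roughly (not executed here, needs a running server):
#   client.put(("test", "demo", "user:42"), single)
#   _, _, bins = client.get(("test", "demo", "user:42"))
#   restored = from_single_bin(bins)
restored = from_single_bin(single)
```

With the single-bin layout every read has to fetch and decode the whole blob, which matches the "always read the full record" use case. With the multi-bin layout a later read could ask for just `name`, at the cost of the extra per-bin overhead described above.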