I'm exploring Aerospike as a key-value DB that stores data on disk for safety. Please confirm that I understand this correctly:
(1) all data will be on disk only, "memory-size" is for indexes only (small usage), all data will be stored in multiple 16GB files (which will be created automatically), and most importantly, every read query will trigger reading data from disk?
(2) all data will be on disk and partly in memory, "memory-size" will act like a cache and hold 4GB of the most-used data, all data will be stored in multiple 16GB files (which will be created automatically), and most importantly, every read query will first check memory and, if the data is missing, read it from disk and add it to memory? Which data will be in memory: the most used or the latest created?
(3) all data will be in memory only, and I'm limited to 4GB of data and no more?
Aerospike doesn't shuffle data in and out of disk the way first-generation NoSQL databases with a "cache-first" architecture do. Aerospike's hybrid memory architecture is such that the primary index (metadata) is always in memory. Depending on the namespace configuration, the data is stored fully on disk or fully in memory; you define the storage for each namespace. If the namespace is in-memory, all of its data and metadata are fully in memory. If the namespace stores its data on devices (/dev/sdb, /dev/sdc), the primary index (metadata) is fully in memory and the data is fully on those SSDs.
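To make that concrete, here is a minimal sketch of an aerospike.conf fragment defining two namespaces with different storage; the namespace names are placeholders, and /dev/sdb, /dev/sdc follow the example above:

```
namespace inmem {
    memory-size 4G
    storage-engine memory      # data and primary index both fully in memory
}

namespace ondisk {
    memory-size 4G             # holds only the primary index (metadata)
    storage-engine device {
        device /dev/sdb        # the data lives entirely on these SSDs
        device /dev/sdc
    }
}
```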
(1) is data on HDD, and the configuration is correct. If you're using an SSD you probably want to use `device` instead of `file`. One thing in your question isn't true: Aerospike will first check the `post-write-queue` on a read, so not every read goes to disk.
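For case (1), a minimal, hedged sketch of the namespace stanza might look like this (the namespace name and file paths are placeholders; the 16GB `filesize` matches the question):

```
namespace test {
    memory-size 4G                 # primary index (metadata) only in this case
    storage-engine device {
        # filesystem-backed storage; on a raw SSD use 'device /dev/sdb' instead
        file /opt/aerospike/data/test-0.dat
        file /opt/aerospike/data/test-1.dat
        filesize 16G               # each file is pre-sized to 16GB at startup
        data-in-memory false       # the data itself stays on disk
    }
}
```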
Aerospike writes data in blocks to optimize around the high-read / low-write performance of HDDs and SSDs. The size of the block is determined by the `write-block-size` config parameter (it should be 1MB for an HDD). Records are first loaded into a streaming write buffer of an equivalent size. After the buffer is flushed to a block on disk, Aerospike doesn't get rid of this in-memory copy immediately; it remains part of the post-write queue (FIFO). By default, 256 of those blocks are in the queue per-device, or per-file (you can define multiple `file` lines as the storage device). If your usage pattern is such that reads closely follow writes, you'll get in-memory access instead of disk access. If your `cache_read_pct` metric is not in the single digits and you have DRAM to spare, you can probably benefit from raising the `post-write-queue` value (max of 2048 blocks per-device).
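Continuing the sketch above, both parameters sit in the `storage-engine device` block; the `post-write-queue` value below is illustrative, not a recommendation:

```
namespace test {
    memory-size 4G
    storage-engine device {
        file /opt/aerospike/data/test-0.dat
        filesize 16G
        write-block-size 1M        # 1MB blocks, as suggested for HDD
        post-write-queue 512       # raised from the default of 256 (max 2048)
    }
}
```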
(2) is an in-memory namespace, persisted to disk. For both (1) and (2) you can use either `file` (for filesystem-based storage) or `device` (for a raw device). For (2), both the primary index (metadata) and the storage (data) are in memory. All reads and writes come out of memory, and a secondary write-through goes to the persistence device.

`filesize` reserves the size of the persistence layer on the filesystem (if you chose to use `file` rather than `device`). You can have multiple `file` lines, each of which will be sized from the start to the value given as `filesize`. `memory-size` is the maximum amount of memory the namespace may use. It isn't pre-reserved; Aerospike's memory usage will grow and shrink over time, with the namespace's `memory-size` as its ceiling.
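A hedged sketch of case (2), an in-memory namespace persisted to a file (names and paths are again placeholders):

```
namespace test {
    memory-size 4G            # ceiling on the namespace's memory use, not pre-reserved
    storage-engine device {
        file /opt/aerospike/data/test.dat
        filesize 16G          # persistence layer reserved on the filesystem
        data-in-memory true   # keep a full copy of the data in memory
    }
}
```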
Take a look at What's New in 3.11, specifically the section that touches on in-memory performance improvements. Tuning `partition-tree-sprigs` and `partition-tree-locks` will likely boost the performance of your in-memory namespaces.
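Both are namespace-level settings; the values below are purely illustrative, so check the 3.11 notes for sizing guidance:

```
namespace test {
    memory-size 4G
    partition-tree-sprigs 512   # splits each partition's index tree into more, shallower sprigs
    partition-tree-locks 32     # more locks reduce contention on index access
    storage-engine memory
}
```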
(3) is a purely in-memory namespace, usually intended to be a cache. The 4G limit affects things such as `stop-writes-pct` and `high-water-memory-pct`, as those are defined as a percentage of that limit (see evictions, expirations, stop-writes).
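A hedged sketch of case (3), data held in memory with no persistence; the two percentages shown are, as far as I recall, the server defaults, so verify them against your version's docs:

```
namespace cache {
    memory-size 4G
    high-water-memory-pct 60    # evictions begin at 60% of memory-size
    stop-writes-pct 90          # writes are refused at 90% of memory-size
    storage-engine memory       # no persistence layer at all
}
```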
There's also a (4) special case for counters called `data-in-index`. See the storage engine configuration recipes.
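From those recipes, a `data-in-index` namespace stores a single integer bin directly in the primary index entry. A sketch, with placeholder names and paths:

```
namespace counters {
    memory-size 4G
    single-bin true             # required by data-in-index
    data-in-index true          # the integer bin value lives in the index entry
    storage-engine device {
        file /opt/aerospike/data/counters.dat
        filesize 16G
        data-in-memory true     # required by data-in-index
    }
}
```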