I am currently designing an Aerospike Cluster that handles many relations and will grow very large very quickly. I have found many references in the aerospike documentation to the digest that is generated when retrieving a key using the python client, but none that show its usefulness outside of saving memory.
From the documentation: A digest is a hash of the key. The key is hashed using the RIPEMD-160 algorithm, which takes a key of any length and always returns a digest 20 bytes in size. If you have a long key, say 200 bytes, obtaining the digest for that key allows you to improve wire performance by saving 180 bytes.
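The fixed-size property is easy to see in a few lines. This is only an illustration: SHA-1 stands in for RIPEMD-160 because RIPEMD-160 availability in Python's hashlib depends on the local OpenSSL build, but both algorithms produce 20-byte digests; the set name "myset" is made up.

```python
import hashlib

# Aerospike hashes the set name together with the user key, producing a
# fixed 20-byte digest no matter how long the key is. SHA-1 is used here
# purely as a stand-in for RIPEMD-160 (both yield 20 bytes).
for key in (b"user1", b"x" * 200):
    digest = hashlib.sha1(b"myset" + key).digest()
    print(len(key), "->", len(digest))  # digest length is always 20
```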
My question is: does using the digest increase look-up times, and is it worth storing digests in other sets in order to create relationships?
The digest is not generated when retrieving a key; rather, it is calculated whenever you initialize a key in the client, and that digest is then used to communicate with the cluster and locate the record.
By default, even the actual key is not saved along with the record data. So internally, all look-ups are performed using digests anyway.
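The flow above can be sketched as a toy model: the client derives a digest from (set, key), and the store is addressed by that digest alone. The `calc_digest` helper, the "orders" set, and the record contents are all hypothetical, and SHA-1 stands in for RIPEMD-160, which may be missing from some hashlib builds.

```python
import hashlib

def calc_digest(set_name: str, user_key: str) -> bytes:
    # Toy model of the client-side digest: Aerospike hashes the set name
    # together with the user key (RIPEMD-160 in the real client; SHA-1
    # stand-in here) into a fixed 20-byte record address.
    return hashlib.sha1(set_name.encode() + user_key.encode()).digest()

# A toy node keyed purely by digest: the original key is not required
# (and by default not even stored) on the server side.
store = {}
long_key = "order:" + "x" * 200
store[calc_digest("orders", long_key)] = {"total": 42}

# A read recomputes the same digest from the same (set, key) pair.
record = store[calc_digest("orders", long_key)]
print(record)  # {'total': 42}
```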
From the documentation:
In the application, each record will have a key associated with it. This key is what the application will use to read or write the record.
However, when the key is sent to the database, the key (together with the set information) is hashed into a 160-bit digest. Within the database, the digest is used to address the record for all operations.
The key is used primarily in the application, while the digest is primarily used for addressing the record in the database.
You will not need to use the digests directly. When you create relationships, you will also create a secondary index for performance, and that works based on hashes anyway, so it makes no difference whether you store the digest or the key.
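A dict-based sketch of why this is true: the secondary index maps bin values to primary keys via hashing, so the lookup cost is the same whether the bin holds the related record's user key or its 20-byte digest. The set and bin names ("customer_id", etc.) are made up for illustration.

```python
# Three toy "order" records, each holding a bin that references the
# related customer record by its user key.
orders = {
    "order1": {"customer_id": "cust42", "total": 10},
    "order2": {"customer_id": "cust42", "total": 25},
    "order3": {"customer_id": "cust7",  "total": 5},
}

# Toy secondary index: bin value -> set of primary keys. The server
# builds a hash-based structure like this internally, so the bin value's
# form (key or digest) does not change the lookup speed.
sindex = {}
for pk, rec in orders.items():
    sindex.setdefault(rec["customer_id"], set()).add(pk)

print(sorted(sindex["cust42"]))  # ['order1', 'order2']
```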
You can also try to model the relationships as complex data types (lists or maps) within the same record.
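For example, a one-record model might keep the relationship inside the parent record as a list, which Aerospike list/map bins can manipulate server-side; this avoids a second set and a secondary index entirely. The record shape and bin names here are invented for the sketch.

```python
# Toy parent record with the relationship embedded as a list bin.
customer = {"name": "Alice", "order_ids": ["order1", "order2"]}

# Adding a related record is a single list append on the parent,
# analogous to a server-side list-append operation on the bin.
customer["order_ids"].append("order3")
print(customer["order_ids"])  # ['order1', 'order2', 'order3']
```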