Given a monotonically increasing property should be used as key or for exact match queries only, we plan to avoid index contention troubles by prepending a computed hash to the property value.
Practical example: importing data from RDBMS, where document id
is sequential and should be used for lookup. So, we calculate hash
of the id
and store {hash}|{id}
.
If this should work, what size of hash would you recommend? For example, if we take first 4 bytes of sha1, would this be good for effective index tablet splitting? Could not find info on this subject. Thank you in advance!
The size of the prefix is proportional to the max traffic you want to handle. So following the 500/50/5 rule, you'll want to have as many prefixes as traffic divided by 500. So if you are doing less than 500 writes/s then there is no need for a hash. If you are doing 1M writes/s with sequential ids, you'll want 2 base64 prefix digits.
You'll also want to ramp up your total traffic following the 500/50/5 rule.