I've been using sinter which does intersection of unordered integer sets. Is there any faster way of doing intersection given that I wouldn't mind sorting beforehand (or performing any other preprocessing)?
EDIT: Found some info [here][1]
EDIT2: Bounty for specific answer: is zinterstore faster than sinter? Benchmarking would be cool too.
In theory the intersection of lists has complexity O(N) where N is the cardinality of the smallest set.
Use SET (SINTER/SINTERSTORE) if have sparsed data / should keep RAM low (O(N*M)) and use BITSET(SETBIT/BITOP) in all other cases (O(N). Like in your edit one info.
The Redis BIT key operations have complexity O(N), there N is the cardinality of the smallest key. And the bitops has very best execution speed based on CPU cache (look at bitops.c
sources). So this can be the absolute winner in you have not sparsed data or if memory not important for you (here is more about strings in Redis).
Do not use ZSET (zinterstore) is you have a plain list of integers and want to intersect them. Sorted set in Redis is a complex structure their keys stored in ziplist
or skiplist
internal encodings. The last one used to store sorted scores but keys stored in other structures. In case of ZSET intersection always much complicated with comparison SET:
ZSET intersection: O(N * K) + O(M * log(M)) worst case with N being the smallest input sorted set, K being the number of input sorted sets and M being the number of elements in the resulting sorted set.
SET intersection: O(N * M) worst case where N is the cardinality of the smallest set and M is the number of sets. Actually math-based minimum in theory.
SET uses dict
/ intset
data structures to store data and in your case (unordered integer sets) intset
would be used. Intset
the most memory save the structure in Redis. And have the best read speed with comparison to ziplist
(doubly linked list, more about this data structure internals here).