I have a table whose integer key (a timestamp) holds the time at which a particular record should be removed from the database. There is also a cleanup query, which selects the records in this table whose expiration time is earlier than now and removes them.
The Erlang documentation says that there are four table types: `set`, `ordered_set`, `bag`, and `duplicate_bag`.
- `set` is implemented using hash tables, so a read takes O(1) time.
- `ordered_set` is implemented using a tree, so a read takes O(log(n)) time, but it works better with consecutive intervals.
- `bag`: I found no description of its implementation.

`ordered_set` seems ideal, but I can't use it because two records can have the same timestamp. So the question is:
How is the `bag` table implemented, and is it good at querying consecutive intervals? If not, how can I get "`ordered_bag`" functionality?
Mnesia's `bag` is implemented using `ETS` and `DETS`, as are the other table types [1]. Also, Mnesia does not support `duplicate_bag` tables, as you can see from the documentation [2]. Thus, we can conclude that `bag` in Mnesia is implemented as a hash table and has constant lookup time, since a `bag` in `ETS` and `DETS` is implemented as a hash table [3]. [4] also says that `set` and `bag` are implemented as hash tables in Mnesia.
On the rest of the question:
No, `bag` is not good at querying consecutive intervals. To get an interval from a `bag` table you must traverse it fully. I see two possible solutions to that.
First, you can use an additional `ordered_set` table to keep the order, as @niahoo suggested. You will then be able to efficiently query all timestamps that fall in an interval and delete the corresponding entries from your `bag` table, which will also be efficient, since you will know all the keys by that point.
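
A minimal sketch of this first approach, assuming plain ETS tables rather than Mnesia so that the example stays self-contained. The module name `expiry_index`, its functions, and the table layouts are hypothetical and chosen only for illustration. The index is walked in key order with `ets:first/1` and `ets:next/2`, so the traversal stops at the first timestamp that has not yet expired.

```erlang
-module(expiry_index).
-export([new/0, add/3, purge/2]).

%% All names here are hypothetical; ETS stands in for Mnesia.

%% Create the index (ordered_set of timestamps) and the data table (bag).
new() ->
    Index = ets:new(expiry_index, [ordered_set]),
    Data = ets:new(expiry_data, [bag]),
    {Index, Data}.

%% Store Value under Timestamp and record the timestamp in the index.
add({Index, Data}, Timestamp, Value) ->
    true = ets:insert(Index, {Timestamp}),
    true = ets:insert(Data, {Timestamp, Value}),
    ok.

%% Remove every entry whose timestamp is less than Now.
%% Returns the removed {Timestamp, Value} tuples, oldest timestamp first.
purge({Index, _Data} = Tabs, Now) ->
    purge(Tabs, Now, ets:first(Index), []).

purge(_Tabs, _Now, '$end_of_table', Acc) ->
    lists:append(lists:reverse(Acc));
purge({Index, Data} = Tabs, Now, Ts, Acc) when Ts < Now ->
    %% Fetch the successor before deleting the current key.
    Next = ets:next(Index, Ts),
    Removed = ets:lookup(Data, Ts),
    true = ets:delete(Data, Ts),
    true = ets:delete(Index, Ts),
    purge(Tabs, Now, Next, [Removed | Acc]);
purge(_Tabs, _Now, _Ts, Acc) ->
    %% First key that has not expired: nothing further to do.
    lists:append(lists:reverse(Acc)).
```

With Mnesia the same shape should carry over by walking the `ordered_set` table with `mnesia:dirty_first/1` and `mnesia:dirty_next/2`, or their transactional counterparts. The cost of a purge is then proportional to the number of expired timestamps rather than to the size of the `bag` table.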
Second, you can use an `ordered_set` of `{timestamp, [values]}`. This requires some additional manual work when inserting and deleting a single entry, but it saves you from creating an additional table if you only need to query the values grouped by `timestamp`.
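
The second approach could look roughly like the sketch below, again assuming a plain ETS `ordered_set` in place of a Mnesia table. The module name `expiry_grouped` and its functions are hypothetical. Inserting or removing a single value becomes a read-modify-write of the list stored under the timestamp, which is the "manual work" mentioned above.

```erlang
-module(expiry_grouped).
-export([new/0, add/3, remove/3, purge/2]).

%% All names here are hypothetical; ETS stands in for Mnesia.

%% One row per timestamp: {Timestamp, [Value]}.
new() ->
    ets:new(expiry_grouped, [ordered_set]).

%% Prepend Value to the list stored under Timestamp.
add(Tab, Timestamp, Value) ->
    Values = case ets:lookup(Tab, Timestamp) of
                 [{_, Vs}] -> Vs;
                 [] -> []
             end,
    true = ets:insert(Tab, {Timestamp, [Value | Values]}),
    ok.

%% Remove one occurrence of Value; drop the row once its list is empty.
remove(Tab, Timestamp, Value) ->
    case ets:lookup(Tab, Timestamp) of
        [{_, Vs}] ->
            case lists:delete(Value, Vs) of
                [] -> ets:delete(Tab, Timestamp);
                Rest -> ets:insert(Tab, {Timestamp, Rest})
            end,
            ok;
        [] ->
            ok
    end.

%% Delete all rows with a timestamp less than Now.
%% Returns the removed rows in ascending timestamp order.
purge(Tab, Now) ->
    purge(Tab, Now, ets:first(Tab), []).

purge(_Tab, _Now, '$end_of_table', Acc) ->
    lists:reverse(Acc);
purge(Tab, Now, Ts, Acc) when Ts < Now ->
    Next = ets:next(Tab, Ts),
    [Row] = ets:lookup(Tab, Ts),
    true = ets:delete(Tab, Ts),
    purge(Tab, Now, Next, [Row | Acc]);
purge(_Tab, _Now, _Ts, Acc) ->
    lists:reverse(Acc).
```

Note that `add/3` and `remove/3` are not atomic in this sketch. With concurrent writers in Mnesia, the lookup and the write would normally be wrapped in a single transaction.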