I need to group Mnesia records by sender_id. Here is the record:
-record(pm,{sender_id, recipient_id, msg, time}).
Basically, I need to fetch all records where recipient_id = X and group them by sender_id.
What is the fastest way to do it?
I think the situation is clear.
There are a couple of ways to solve this.
Get the results and sort them by sender_id
This is by far the simplest solution. Just get the result (a list of records) as you normally would and sort the records by sender_id, so that records from the same sender end up next to each other:
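%% lists:sort/2 expects the fun to return true when A =< B;
%% using =< (not <) keeps records with equal sender_id in their original order.
lists:sort(fun(A, B) ->
    A#pm.sender_id =< B#pm.sender_id
end, Results).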
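Sorting only orders the records; it does not split them into groups. The sketch below shows one way to go from the sorted list to actual groups. The module name pm_group and the helpers fetch_for_recipient/1 and group_sorted/1 are made up for illustration, and the fetch assumes the table is named pm and that a plain mnesia:match_object/1 on recipient_id is acceptable for your data size.

```erlang
-module(pm_group).
-export([fetch_for_recipient/1, group_sorted/1]).

-record(pm, {sender_id, recipient_id, msg, time}).

%% Assumed way of obtaining Results: match every pm record whose
%% recipient_id equals RecipientId, inside a transaction.
fetch_for_recipient(RecipientId) ->
    Pattern = #pm{recipient_id = RecipientId, _ = '_'},
    {atomic, Results} =
        mnesia:transaction(fun() -> mnesia:match_object(Pattern) end),
    Results.

%% Sort by sender_id (lists:keysort/2 is stable), then collapse adjacent
%% records into a list of {SenderId, Records} pairs.
group_sorted(Results) ->
    Sorted = lists:keysort(#pm.sender_id, Results),
    lists:foldr(fun add_to_group/2, [], Sorted).

%% Same sender as the group at the head of the accumulator: extend it.
add_to_group(#pm{sender_id = S} = R, [{S, Rs} | Rest]) ->
    [{S, [R | Rs]} | Rest];
%% Different sender (or empty accumulator): start a new group.
add_to_group(#pm{sender_id = S} = R, Groups) ->
    [{S, [R]} | Groups].
```

With this, pm_group:group_sorted(pm_group:fetch_for_recipient(X)) would return one {SenderId, Records} tuple per sender, ordered by sender_id.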
Use qlc:fold/3
This is a bit more involved, but it does seem a little more elegant than using a plain old lists:sort/2. Instead of returning a list, it returns a dict.
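%% qlc:q/1 needs -include_lib("stdlib/include/qlc.hrl"). in the module.
%% Call this inside an Mnesia transaction (or another Mnesia activity).
select_grouped_pms_for_recipient(RecipientId) ->
    Q = qlc:q([E || E <- mnesia:table(pm), E#pm.recipient_id == RecipientId]),
    qlc:fold(fun group/2, dict:new(), Q).

%% Append the record to the list stored under its sender_id.
group(Record, Acc) ->
    dict:append(Record#pm.sender_id, Record, Acc).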
In the group/2 function we push the record onto a list of values for the sender ID in the dict accumulator. select_grouped_pms_for_recipient/1 will return the resulting dict containing lists of all the records grouped under the sender ID keys.
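As a rough usage sketch, here is how the returned dict might be consumed. The module name pm_dict_usage and the functions counts_for_recipient/1 and counts/1 are hypothetical; the two functions from the answer are repeated only so the module compiles on its own. It assumes Mnesia is running and the pm table exists.

```erlang
-module(pm_dict_usage).
-export([counts_for_recipient/1, counts/1]).
-include_lib("stdlib/include/qlc.hrl").

-record(pm, {sender_id, recipient_id, msg, time}).

%% Run the grouping query in a transaction, then summarise the dict.
counts_for_recipient(RecipientId) ->
    {atomic, Grouped} =
        mnesia:transaction(
          fun() -> select_grouped_pms_for_recipient(RecipientId) end),
    counts(Grouped).

%% Turn the dict into a sorted list of {SenderId, NumberOfMessages}.
counts(Grouped) ->
    lists:sort([{SenderId, length(Pms)}
                || {SenderId, Pms} <- dict:to_list(Grouped)]).

%% The two functions from the answer, unchanged in behaviour.
select_grouped_pms_for_recipient(RecipientId) ->
    Q = qlc:q([E || E <- mnesia:table(pm),
                    E#pm.recipient_id == RecipientId]),
    qlc:fold(fun group/2, dict:new(), Q).

group(Record, Acc) ->
    dict:append(Record#pm.sender_id, Record, Acc).
```

To get the messages of a single sender, dict:find(SenderId, Grouped) returns {ok, Records} or error, which avoids the exception dict:fetch/2 raises when the key is missing.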
The Erlang mailing list is a treasure trove of good questions and answers. I found this question on the mailing list, which is very similar to yours.
Which is faster?
Looking at these solutions, I'm not sure there will be a big difference in performance. I imagine they are about the same, but with qlc the grouping logic runs for each item as it is received, while the lock is still held, so it may hold the Mnesia lock for a bit longer.
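Rather than guessing, you could measure both on your own data. The helper below is only a sketch (the module name pm_timing and avg_micros/2 are made up); it uses timer:tc/1 to average the wall-clock time of any zero-arity fun over N runs.

```erlang
-module(pm_timing).
-export([avg_micros/2]).

%% Run Fun N times and return the mean elapsed time in microseconds.
%% timer:tc/1 returns {ElapsedMicroseconds, Result}.
avg_micros(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    Total = lists:sum([element(1, timer:tc(Fun))
                       || _ <- lists:seq(1, N)]),
    Total / N.
```

For example, wrap each approach in a fun, such as fun() -> mnesia:transaction(fun() -> select_grouped_pms_for_recipient(42) end) end, and compare the two averages. The numbers will depend on table size, table type, and how busy the node is, so treat them as indicative only.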