
What are the differences between ETS, persistent_term and process dictionaries?


I know there are (at least) three ways of keeping mutable state in Erlang:

  • ETS tables
  • persistent_term
  • Process dictionaries

The basic usage of them looks very similar to me:

% ETS
1> ets:new(table1, [named_table]).
2> ets:insert(table1, {thing}).
3> ets:lookup(table1, thing).
[{thing}]

% persistent_term
1> persistent_term:put(table2, thing).
2> persistent_term:get(table2).
thing

% Process dictionary
1> put(table3, thing).
2> get(table3).
thing

What are the differences and pros/cons of using one over another?

I see that ETS behaves more like a map, but how does it differ from keeping maps in persistent_terms or process dictionaries?


Solution

  • persistent_term and ETS expose similar APIs, but they are different. To quote the manual:

    Persistent terms is an advanced feature and is not a general replacement for ETS tables.

    The differences lie in where the terms are stored and what happens on updates. Terms in ETS are copied to the process heap on every read, whereas persistent_term terms are only referenced. As a consequence, when an item in persistent_term is updated or erased, the runtime runs a global garbage collection, and every process that still holds a reference to the old term has to copy it to its own heap to keep using it.

    With huge terms, referencing them instead of copying them saves a lot of time on reads, but the penalty for updates is harsh.
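    As a rough sketch of the read-mostly pattern this suggests, a term can be stored once and then only read. The key {example_app, config} and the config map are made up for illustration; the expressions can be pasted into the erl shell, and each match doubles as a check:

```erlang
% Hypothetical key and value, chosen only for this sketch.
Key = {example_app, config}.
Config = #{pool_size => 10, hosts => ["a.example", "b.example"]}.

% Store once, typically at application start.
ok = persistent_term:put(Key, Config).

% Reads hand back a reference to the stored term; it is not copied
% to the calling process's heap.
Config = persistent_term:get(Key).

% get/2 returns a default instead of raising badarg for a missing key.
none = persistent_term:get({example_app, missing}, none).

% Putting an equal value again is documented as a cheap no-op.
% Putting a different value, or erasing, triggers a global GC.
ok = persistent_term:put(Key, Config).
true = persistent_term:erase(Key).
```

    If the value has to change often, an ETS table is usually the safer choice, since its updates do not involve other processes.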

    Also, persistent_term is only a key-value store, whereas an ETS table may be a set, ordered_set, bag or duplicate_bag. ETS also provides select and match semantics.
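    To illustrate the table types and the query functions, here is a minimal sketch for the erl shell. The table names and the stored tuples are invented for the example:

```erlang
% Table names and contents are made up for this sketch.
% A bag may hold several objects under the same key.
Bag = ets:new(example_bag, [bag]).
true = ets:insert(Bag, [{fruit, apple}, {fruit, pear}, {veg, leek}]).
[apple, pear] = lists:sort([V || {fruit, V} <- ets:lookup(Bag, fruit)]).

% match/2 returns the bindings of '$1' for every object that fits the pattern.
[[apple], [pear]] = lists:sort(ets:match(Bag, {fruit, '$1'})).

% An ordered_set keeps its keys sorted.
Ord = ets:new(example_ordered, [ordered_set]).
true = ets:insert(Ord, [{3, c}, {1, a}, {2, b}]).
1 = ets:first(Ord).

% select/2 takes a match specification: head, guards, result.
[b, c] = ets:select(Ord, [{{'$1', '$2'}, [{'>', '$1', 1}], ['$2']}]).
```

    persistent_term has no counterpart to any of this: it only offers lookup by exact key.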

    persistent_term is an optimized replacement for one very specific use case that ETS tables used to cover: terms that are read frequently but updated rarely or never.

    Regarding the process dictionary, it works like a local #{} inside the process and is always available. ETS tables and persistent_term are reachable from every process, whereas each process has its own separate process dictionary. Be very careful when using the process dictionary, as it hurts readability.
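    The visibility difference can be sketched in the erl shell as follows. All key and table names are made up for illustration, and the ETS table uses the default protected access, so other processes may read it but not write to it:

```erlang
% Key and table names are made up for this sketch.
put(example_local, thing).
ok = persistent_term:put(example_shared, thing).
Tab = ets:new(example_table, []).
true = ets:insert(Tab, {example_key, thing}).

% A second process looks up all three and reports back.
Parent = self().
spawn(fun() ->
    Parent ! {child_saw,
              get(example_local),
              persistent_term:get(example_shared, undefined),
              ets:lookup(Tab, example_key)}
end).

% Only the process dictionary entry is invisible to the child.
{undefined, thing, [{example_key, thing}]} =
    receive
        {child_saw, Local, Shared, Rows} -> {Local, Shared, Rows}
    after 1000 ->
        timeout
    end.
```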