Search code examples
erlangets

retrieval of data from ETS table


I know that lookup time is constant for ETS tables. But I also heard that the table is kept outside of the process and when retrieving data, it needs to be moved to the process heap. So, this is expensive. But then, how to explain this:

18> {Time, [[{ok, Binary}]]} = timer:tc(ets, match, [utilo, {a, '$1'}]).
{0,
 [[{ok,<<255,216,255,225,63,254,69,120,105,102,0,0,73,
         73,42,0,8,0,0,0,10,0,14,...>>}]]}
19> size(Binary).
1759017

1.7 MB binary takes 0 time to be retrieved from the table!?

EDIT: After I saw Odobenus Rosmarus's answer, I decided to convert the binary to list. Here is the result:

1> {ok, B} = file:read_file("IMG_2171.JPG").
{ok,<<255,216,255,225,63,254,69,120,105,102,0,0,73,73,42,
      0,8,0,0,0,10,0,14,1,2,0,32,...>>}
2> size(B).
1986392
3> L = binary_to_list(B).
[255,216,255,225,63,254,69,120,105,102,0,0,73,73,42,0,8,0,0,
 0,10,0,14,1,2,0,32,0,0|...]
4> length(L).
1986392
5> ets:insert(utilo, {a, L}).
true
6> timer:tc(ets, match, [utilo, {a, '$1'}]).
{106000,
 [[[255,216,255,225,63,254,69,120,105,102,0,0,73,73,42,0,8,0,
    0,0,10,0,14,1,2|...]]]}

Now it takes 106000 microseconds to retrieve 1986392 long list from the table which is pretty fast, isn't it? Lists are 2 words per element. Thus the data is 4x1.7MB.

EDIT 2: I started a thread on erlang-question (http://groups.google.com/group/erlang-programming/browse_thread/thread/5581a8b5b27d4fe1) and it turns out that 0.1 second is pretty much the time it takes to do memcpy() (move the data to the process's heap). On the other hand Odobenus Rosmarus's answer explains why retrieving binary takes 0 time.


Solution

  • binaries itself (that longer than 64 bits) are stored in the special heap, outside of process heap.

    So, retrieval of binary from the ets table moves to process heap just 'Procbin' part of binary. (roughly it's pointer to start of binary in the binaries memory and size).