MongoDB Erlang driver: where does the find limit occur?


I'm trying to integrate the MongoDB driver in Erlang.

After some coding, it appears to me that the only way to limit the number of retrieved documents is to do it on the cursor, after the find() call.

Here's my code so far:

Cursor = mongo:find(Connection, Collection, Selector),
Result = case Limit of
              infinity ->
                 mc_cursor:rest(Cursor);
              _ ->
                 mc_cursor:take(Cursor, Limit)
         end,
mc_cursor:close(Cursor)

  • What worries me is what happens when the collection is huge.
  • Won't the result be too large to fetch and fit in memory?
  • How does the cursor actually work?
  • Or is there simply a better way to limit the fetch?

Solution

  • I think you could use the batch size parameter (BatchSize) of find/6. The following code is from the mongo.erl file:

    %% @doc Return projection of selected documents starting from Nth document in batches of batchsize.
    %%      0 batchsize means default batch size.
    %%      Negative batch size means one batch only.
    %%      Empty projection means full projection.
    -spec find(pid(), collection(), selector(), projector(), skip(), batchsize()) -> cursor(). % Action
    find(Connection, Coll, Selector, Projector, Skip, BatchSize) ->
        mc_action_man:read(Connection, #'query'{
            collection = Coll,
            selector = Selector,
            projector = Projector,
            skip = Skip,
            batchsize = BatchSize
        }).
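
As a minimal sketch of how a caller's limit could be mapped onto that BatchSize argument: the helper below is hypothetical (the module name find_limit and the function batch_size_for/1 are not part of the driver). It relies on the doc comment above ("Negative batch size means one batch only") and assumes the driver forwards the value unchanged to the server, which then returns at most that many documents in a single batch and closes the cursor.

```erlang
-module(find_limit).
-export([batch_size_for/1]).

%% Hypothetical helper, not part of the driver.
%% infinity -> 0, i.e. the default batch size; the cursor keeps fetching on demand.
%% Limit    -> -Limit, i.e. a single batch of at most Limit documents (assumed).
-spec batch_size_for(infinity | pos_integer()) -> integer().
batch_size_for(infinity) ->
    0;
batch_size_for(Limit) when is_integer(Limit), Limit > 0 ->
    -Limit.
```

It could then be used as mongo:find(Connection, Collection, Selector, Projector, 0, find_limit:batch_size_for(Limit)), where Projector is whatever empty projection your driver version expects. A single batch is still bounded by the server's maximum reply size, so a very large Limit may return fewer documents than requested.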
    

    ===============

    Response to the comments:

    In the mc_action_man.erl file, the driver still uses a cursor to save the "current position":

    read(Connection, Request = #'query'{collection = Collection, batchsize = BatchSize} ) ->
        {Cursor, Batch} = mc_connection_man:request(Connection, Request),
        mc_cursor:create(Connection, Collection, Cursor, BatchSize, Batch).
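
To illustrate why a cursor created this way should not pull a huge collection into memory, here is a self-contained toy model, not the driver's actual implementation. The module cursor_sketch and all of its functions are made up for illustration. It assumes, as the read/2 code above suggests, that the first batch arrives with the query reply and that further batches are requested only when the local batch is exhausted.

```erlang
-module(cursor_sketch).
-export([new/2, take/2, fetches/1]).

%% Toy cursor: documents already fetched (batch), documents still "on the
%% server" (remote), the batch size, and a count of simulated round trips.
-record(cur, {batch = [], remote = [], size = 1, fetches = 0}).

%% Create a cursor; the first batch comes back with the initial query.
new(Docs, BatchSize) when is_list(Docs), BatchSize > 0 ->
    fetch(#cur{remote = Docs, size = BatchSize}).

%% Return up to N documents together with the updated cursor.
take(Cur, N) when is_integer(N), N >= 0 ->
    take(Cur, N, []).

take(Cur, 0, Acc) ->
    {lists:reverse(Acc), Cur};
take(#cur{batch = [], remote = []} = Cur, _N, Acc) ->
    {lists:reverse(Acc), Cur};                  % nothing left anywhere
take(#cur{batch = []} = Cur, N, Acc) ->
    take(fetch(Cur), N, Acc);                   % simulated "getmore"
take(#cur{batch = [Doc | Rest]} = Cur, N, Acc) ->
    take(Cur#cur{batch = Rest}, N - 1, [Doc | Acc]).

%% Number of simulated round trips made so far.
fetches(#cur{fetches = F}) ->
    F.

%% Move one batch from the simulated server into local memory.
fetch(#cur{remote = Remote, size = Size, fetches = F} = Cur) ->
    {Batch, Rest} = lists:split(min(Size, length(Remote)), Remote),
    Cur#cur{batch = Batch, remote = Rest, fetches = F + 1}.
```

In this model, taking 250 documents from a 1000-document collection with a batch size of 100 triggers three fetches, and the other 700 documents never leave the simulated server. By the same reasoning, mc_cursor:take(Cursor, Limit) in the question would be the bounded case, while mc_cursor:rest(Cursor) is the one that would load everything.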
    

    mc_worker.erl is where the request is actually sent to the database. I think you could add logging code there (e.g., with lager) to monitor the actual requests and find the problem.

    handle_call(Request, From, State = #state{socket = Socket, ets = Ets, conn_state = CS}) % read requests
        when is_record(Request, 'query'); is_record(Request, getmore) ->
        UpdReq = case is_record(Request, 'query') of
                     true -> Request#'query'{slaveok = CS#conn_state.read_mode =:= slave_ok};
                     false -> Request
                 end,
        {ok, Id} = mc_worker_logic:make_request(Socket, CS#conn_state.database, UpdReq),
        inet:setopts(Socket, [{active, once}]),
        RespFun = fun(Response) -> gen_server:reply(From, Response) end,  % save function, which will be called on response
        true = ets:insert_new(Ets, {Id, RespFun}),
        {noreply, State};
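
A minimal sketch of such a logging hook, using OTP's built-in logger module instead of lager so that it has no external dependency. The module request_log and the function log_request/1 are hypothetical names, and patching the driver like this is only meant as a debugging aid.

```erlang
-module(request_log).
-export([log_request/1]).

%% Hypothetical helper: log the request term and return it unchanged,
%% so it can wrap the argument of an existing call.
%% The notice level is used because it is shown under logger's default settings.
log_request(Request) ->
    logger:notice("mongo request: ~p", [Request]),
    Request.
```

In the handle_call clause above, the call could then become mc_worker_logic:make_request(Socket, CS#conn_state.database, request_log:log_request(UpdReq)), which would show the batchsize value of each 'query' record and whether any getmore requests follow it.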