python-2.7 erlang riak erlang-shell erlang-ports

How to write a reduce phase function in Erlang for a Riak database


I have data in a Riak bucket. Using the python-riak client, I get data like the following:

<<"{\"META\": {\"campaign\": \"5IVUPHE42HP1NEYvKb7qSvpX2Cm\",
      \"createdat\": 1406978070.0,
      \"user_email\": \"[email protected]\"},
\"mode\": 2,
\"status\": \"success\"}">>

Every key holds data in this format.

From the map phase, using the python-riak client, I get data like this:

[{'5IVUPHE42HP1NEYvKb7qSvpX2Cm': 1},
{'WL6iHLCgs492rFEFvqewzvCfFfj': 2},
{'5IVUPHE42HP1NEYvKb7qSvpX2Cm': 1}, 
{'5IVUPHE42HP1NEYvKb7qSvpX2Cm': 2}] 

For this data I have to write a reduce phase in Erlang which, when used with the python-riak client, gives output like this:

{'5IVUPHE42HP1NEYvKb7qSvpX2Cm': {'ab_leads': 2, 'cp_leads': 1},
 'WL6iHLCgs492rFEFvqewzvCfFfj': {'ab_leads': 0, 'cp_leads': 1}}

From the list of {Key, Value} pairs produced by the map phase, I have to write a phase that checks each value and introduces two new counters in the result: if the value is 0 or 1, ab_leads is incremented for that key; if the value is 2, cp_leads is incremented for that key.

I have been trying the code below, but it does not give the result I want. I also need to carry the previous result over into the next list of values, since, as I understand from Riak, the reduce phase takes values in rounds of at least 20.

lists:foldl(fun({Key, Mode}, Acc) ->
                if Mode == 0; Mode == 1 ->
                       orddict:update_counter({Key, <<"ab_leads">>}, 1, Acc);
                   true ->
                       orddict:update_counter({Key, <<"cp_leads">>}, 1, Acc)
                end
            end,
            orddict:new(),
            G).

This gives a result like this:

[{{<<"a">>,<<"ab_leads">>},2},{{<<"a">>,<<"cp_leads">>},1}]

So I need to convert that into the shape I described above:

[{Key,{ab_leads:1,cp_leads:2}}]     

Solution

  • If I understand you correctly, you are trying to get, for each campaign, the number of keys by 'mode': 0 or 1 counted as ab_leads, 2 counted as cp_leads.

    Although you asked about the reduce function, I believe we must first look at the map phase, and this is why:

    "The most important thing to understand is that the function defining the reduce phase may be evaluated multiple times, and the input of later evaluations will include the output of earlier evaluations." [http://docs.basho.com/riak/latest/dev/advanced/mapreduce/#MapReduce]

    So the easiest way to handle this is to make the output of the map look the same as the output of the reduce. So first get your map output looking like this:

    [{'5IVUPHE42HP1NEYvKb7qSvpX2Cm': {'ab_leads': 1, 'cp_leads': 0}}, ...]
    

    Clearly, this makes the reduce phase a little harder (welcome to map/reduce), and those ab_leads and cp_leads names would be passed around all over the place, so a plain {ab_leads, cp_leads} tuple may be easier to handle for the moment:

    [{'5IVUPHE42HP1NEYvKb7qSvpX2Cm': {1, 0}}, ...]
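
    As a minimal sketch of that map-side conversion, the mode can be turned into a one-element list of {Campaign, {Ab, Cp}}. The module and function names are hypothetical, and fetching and JSON-decoding the Riak object are left out. It also assumes that modes other than 0, 1 and 2 should contribute nothing:

    ```erlang
    -module(leads_map).
    -export([to_counts/2]).

    %% Hypothetical helper: turn one record's campaign and mode into the
    %% reduce-compatible shape [{Campaign, {AbLeads, CpLeads}}].
    %% Mode 0 or 1 counts as one ab_lead, mode 2 as one cp_lead.
    to_counts(Campaign, Mode) when Mode == 0; Mode == 1 ->
        [{Campaign, {1, 0}}];
    to_counts(Campaign, Mode) when Mode == 2 ->
        [{Campaign, {0, 1}}];
    %% Assumption: any other mode is ignored, so nothing is emitted.
    to_counts(_Campaign, _OtherMode) ->
        [].
    ```

    In a real Riak map function, which receives the object, key data and a static argument and must return a list, this would be the return value once the campaign and mode have been pulled out of the decoded JSON.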
    

    Our reduce now looks something like this:

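    %% G is the list of {Key, {Ab_leads, Cp_leads}} tuples handed to the reduce.
    lists:foldl(fun({Key,{Ab_leads, Cp_leads}}, Acc) -> 
        %% Totals seen so far for this key, or {0, 0} the first time.
        {Ab_leadsAcc, Cp_leadsAcc} = proplists:get_value(Key, Acc, {0, 0}),
        %% Replace the key's old entry with the updated totals.
        [ {Key, {Ab_leadsAcc + Ab_leads, Cp_leadsAcc + Cp_leads}} | proplists:delete(Key, Acc)]
    end,
    [],
    G).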
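
    Wrapped up as a function Riak could call, the same fold might look like the sketch below. The module name leads_reduce and the demo/0 helper are assumptions for illustration; the two-argument shape (input list plus a static argument) is what Riak passes to an Erlang reduce function. Because the output has the same {Key, {Ab, Cp}} shape as the input, feeding an earlier result back in together with new values still gives correct totals:

    ```erlang
    -module(leads_reduce).
    -export([reduce_leads/2, demo/0]).

    %% Reduce function: takes a list of {Key, {Ab, Cp}} tuples and a static
    %% argument (unused here), and returns a list of the same shape.
    reduce_leads(Values, _Arg) ->
        lists:foldl(fun merge/2, [], Values).

    %% Add one {Key, {Ab, Cp}} entry onto the running totals for that key.
    merge({Key, {Ab, Cp}}, Acc) ->
        {AbAcc, CpAcc} = proplists:get_value(Key, Acc, {0, 0}),
        [{Key, {AbAcc + Ab, CpAcc + Cp}} | proplists:delete(Key, Acc)].

    %% Simulates two reduce rounds: the second round sees the first round's
    %% output mixed with fresh map output.
    demo() ->
        Round1 = reduce_leads([{<<"a">>, {1, 0}}, {<<"b">>, {0, 1}}], none),
        reduce_leads(Round1 ++ [{<<"a">>, {1, 0}}, {<<"a">>, {0, 1}}], none).
    ```

    Calling leads_reduce:demo() should return [{<<"a">>,{2,1}},{<<"b">>,{0,1}}], which has the same totals as the result expected in the question.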
    

    Note that the tuple made the reduce function much easier to write, but you can keep the ab_leads and cp_leads names for future expandability by ensuring your reduce can handle a proplist as the value. You can go back to orddicts if you like, but proplists are a simple fit when order doesn't matter.
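
    A sketch of that proplist variant, under the assumption that the map phase emits values such as [{<<"ab_leads">>, 1}, {<<"cp_leads">>, 0}] (the module name and the binary counter names are illustrative choices, not something Riak requires):

    ```erlang
    -module(leads_reduce_named).
    -export([reduce_leads/2]).

    %% Same fold as before, but each value is a proplist of named counters
    %% instead of a bare {Ab, Cp} tuple.
    reduce_leads(Values, _Arg) ->
        lists:foldl(fun merge/2, [], Values).

    %% Merge one key's counters into the totals accumulated for that key.
    merge({Key, Counts}, Acc) ->
        Old = proplists:get_value(Key, Acc, []),
        [{Key, add_counts(Counts, Old)} | proplists:delete(Key, Acc)].

    %% Add every counter in New onto Old; counters missing from Old start at 0.
    add_counts(New, Old) ->
        lists:foldl(
          fun({Name, N}, Sum) ->
                  Prev = proplists:get_value(Name, Sum, 0),
                  lists:keystore(Name, 1, Sum, {Name, Prev + N})
          end,
          Old,
          New).
    ```

    Adding a third counter later only means emitting it from the map phase; the reduce itself would not need to change.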