
Riak/Java - Best practices for MapReduce query on Secondary Indexes with AND conditions and ordering


I'm trying to implement a Map/Reduce function on Riak using Java and Secondary Indexes. Specifically, I'm trying to implement an AND condition plus sorting of the results on a specific index key. This function will be used on crowded buckets (on the order of a hundred million stored items).

Since Riak doesn't natively support AND conditions or sorting, I would like to hear different points of view on how to implement this (taking into consideration performance issues on such a large bucket).

Suppose I have the following data:

key: key1
index-field1_bin: car
index-field2_int: 1

key: key2
index-field1_bin: car
index-field2_int: 3

key: key3
index-field1_bin: bike
index-field2_int: 4

key: key4
index-field1_bin: car
index-field2_int: 2

How would you retrieve, in Java, items that satisfy the following condition:

index-field1_bin == car
3 <= index-field2_int <= 4

and then sort them by index-field2_int in ascending order?

Thanks


Solution

  • I may have found a solution, but I still need to do some serious benchmarking on it.

    // 2i lookup: every key in "bucketName" whose field1 index equals "car".
    IndexQuery iq = new BinValueQuery(BinIndex.named("field1"),
            "bucketName", "car");
    // Map phase: keep only objects whose field2 index lies in [3, 4].
    Function mapFunction = new JSSourceFunction(
            "function(v) {" +
                "var range = v.values[0].metadata.index.field2;" +
                "if (range <= 4 && range >= 3) {" +
                    "return [v.values[0]];" +
                "}" +
                "return [];" +
            "}");
    // Reduce phase: sort ascending by field2. The sorted list is returned
    // flat, because Riak may feed a reduce function its own earlier output.
    Function reduceFunction = new JSSourceFunction(
            "function(v) {" +
                "return v.sort(function(a, b) {" +
                    "return a.metadata.index.field2 - b.metadata.index.field2;" +
                "});" +
            "}");
    MapReduceResult result = RiakUtils.getClient().mapReduce(iq)
                                .addMapPhase(mapFunction)
                                .addReducePhase(reduceFunction)
                                .execute();

    // Print the results
    System.out.println(result.getResultRaw());
    

    Basically, the index query fetches all the "car" items; the map phase then filters them by their field2 range and the reduce phase sorts them, all within a single MapReduce operation.
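
To make the intended result easy to check, here is a minimal plain-Java sketch of the same filter-then-sort logic applied in memory to the four sample items from the question. It does not talk to Riak at all; the `Item` class, `sampleData()` and `query()` are hypothetical names introduced only for this illustration, and it is meant as a reference for what the MapReduce job should return, not as a replacement for it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class AndRangeSortSketch {

    // Hypothetical in-memory stand-in for a stored object and its two index values.
    static final class Item {
        final String key;
        final String field1; // corresponds to index-field1_bin
        final long field2;   // corresponds to index-field2_int

        Item(String key, String field1, long field2) {
            this.key = key;
            this.field1 = field1;
            this.field2 = field2;
        }
    }

    // The four sample items from the question.
    static List<Item> sampleData() {
        List<Item> items = new ArrayList<>();
        items.add(new Item("key1", "car", 1));
        items.add(new Item("key2", "car", 3));
        items.add(new Item("key3", "bike", 4));
        items.add(new Item("key4", "car", 2));
        return items;
    }

    // field1 == wanted AND min <= field2 <= max, sorted by field2 ascending.
    static List<String> query(List<Item> items, String wanted, long min, long max) {
        List<Item> matches = new ArrayList<>();
        for (Item item : items) {
            // The equality test plays the role of the index query,
            // the range test plays the role of the map phase.
            if (item.field1.equals(wanted) && item.field2 >= min && item.field2 <= max) {
                matches.add(item);
            }
        }
        // This sort plays the role of the reduce phase.
        matches.sort(Comparator.comparingLong((Item i) -> i.field2));

        List<String> keys = new ArrayList<>();
        for (Item item : matches) {
            keys.add(item.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(query(sampleData(), "car", 3, 4)); // prints [key2]
        System.out.println(query(sampleData(), "car", 2, 4)); // prints [key4, key2]
    }
}
```

With the range from the question (3 to 4) only key2 qualifies; widening the range to 2 to 4 also brings in key4, which then sorts ahead of key2.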