mapreduce dc.js crossfilter reductio

custom reduce functions in crossfilter on 2 fields


My data looks like this:

field1,field2,value1,value2
a,b,1,1
b,a,2,2
c,a,3,5
b,c,6,7
d,a,6,7

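The ultimate goal is to get one total for each distinct value that appears in field1 or field2: the sum of value1 over the rows where it appears in field1, plus the sum of value2 over the rows where it appears in field2. For the data above: {a: 15 (=1+2+5+7), b: 9 (=1+2+6), c: 10 (=3+7), d: 6 (=6)}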
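To make the target concrete, here is a minimal plain-JavaScript sketch of that computation, outside crossfilter and without any filtering support. The helper name `sumByLabel` is made up for illustration, and the rows are the sample data with the values already parsed as numbers.

```javascript
// Sample rows from the question, values already converted to numbers.
const rows = [
  { field1: "a", field2: "b", value1: 1, value2: 1 },
  { field1: "b", field2: "a", value1: 2, value2: 2 },
  { field1: "c", field2: "a", value1: 3, value2: 5 },
  { field1: "b", field2: "c", value1: 6, value2: 7 },
  { field1: "d", field2: "a", value1: 6, value2: 7 },
];

// Hypothetical helper (not a crossfilter API): value1 is credited to the
// label found in field1, value2 to the label found in field2.
function sumByLabel(data) {
  const totals = {};
  for (const r of data) {
    totals[r.field1] = (totals[r.field1] || 0) + r.value1;
    totals[r.field2] = (totals[r.field2] || 0) + r.value2;
  }
  return totals;
}

const expectedTotals = sumByLabel(rows);
console.log(expectedTotals); // a: 15, b: 9, c: 10, d: 6
```

This only pins down the expected numbers; it says nothing yet about how to get them from a crossfilter group that reacts to filters.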

I don't have a good way of rearranging that data so let's assume the data has to stay like this.

Based on this previous question (thanks @Gordon), I mapped using:

cf.dimension(function(d) { return [d.field1,d.field2]; }, true);

But I am a bit puzzled as to how to write the custom reduce functions for my use case. The main question is: from within the reduceAdd and reduceRemove functions, how do I know which key is currently being "worked on"? In other words, how do I know whether I am supposed to add value1 or value2 to my sum?

(I have tagged dc.js and reductio because this could be useful for users of those libraries.)


Solution

  • OK, so I ended up defining the group's reduce functions as follows:

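    // Each group's value is an object holding one running total per label.
    // Pass these to group.reduce(reduceAdd, reduceRemove, reduceInitial).
    const reduceAdd = (p, v) => {
        if (!p.hasOwnProperty(v.field1)) {
            p[v.field1] = 0;
        }
        if (!p.hasOwnProperty(v.field2)) {
            p[v.field2] = 0;
        }
        p[v.field1] += +v.value1; // unary + coerces CSV strings to numbers
        p[v.field2] += +v.value2;
        return p;
    };
    const reduceRemove = (p, v) => {
        p[v.field1] -= +v.value1;
        p[v.field2] -= +v.value2;
        return p;
    };
    const reduceInitial = () => ({});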
    

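    The reduce functions never need to know which key is being worked on: every group keeps a running total for each label seen in its rows, and you pick the right one when reading the result. So when you use the group in a chart, you just change the valueAccessor to (d) => d.value[d.key] instead of the usual (d) => d.value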

    This is slightly inefficient, since each group's value stores more totals than you need, but unless you have millions of distinct values the cost is basically negligible.
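To see why reading d.value[d.key] gives the right totals, here is a rough sketch that runs the reducers without crossfilter. `simulateGroups` is a hypothetical stand-in, written for illustration only: it assumes that a group on the array dimension hands each row to the reducer of every key the row carries, and it visits each distinct key once per row (rows whose two fields are equal are not covered). The reducers are restated in compact form so the block runs on its own.

```javascript
// Sample rows from the question, values already converted to numbers.
const records = [
  { field1: "a", field2: "b", value1: 1, value2: 1 },
  { field1: "b", field2: "a", value1: 2, value2: 2 },
  { field1: "c", field2: "a", value1: 3, value2: 5 },
  { field1: "b", field2: "c", value1: 6, value2: 7 },
  { field1: "d", field2: "a", value1: 6, value2: 7 },
];

// Compact restatement of the reducers from the answer.
const reduceAdd = (p, v) => {
  p[v.field1] = (p[v.field1] || 0) + +v.value1;
  p[v.field2] = (p[v.field2] || 0) + +v.value2;
  return p;
};
const reduceInitial = () => ({});

// Hypothetical stand-in for a group on the array dimension (an assumption,
// not crossfilter code): each row is added to the group of every key it has.
function simulateGroups(data) {
  const byKey = new Map();
  for (const r of data) {
    for (const key of new Set([r.field1, r.field2])) {
      if (!byKey.has(key)) byKey.set(key, reduceInitial());
      reduceAdd(byKey.get(key), r);
    }
  }
  // Mirror the { key, value } entries a group exposes to charts.
  return [...byKey].map(([key, value]) => ({ key, value }));
}

// The accessor from the answer: read only the total for the group's own key.
const valueAccessor = (d) => d.value[d.key];

const groups = simulateGroups(records);
const chartValues = Object.fromEntries(
  groups.map((d) => [d.key, valueAccessor(d)])
);
console.log(chartValues); // a: 15, b: 9, c: 10, d: 6
```

In this sketch the group for key a ends up holding { a: 15, b: 3, c: 3, d: 6 }; only the a entry is ever read, which is the extra storage mentioned above.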