My data looks like this:
field1,field2,value1,value2
a,b,1,1
b,a,2,2
c,a,3,5
b,c,6,7
d,a,6,7
I don't have a good way of rearranging that data so let's assume the data has to stay like this.
I want to create a dimension on field1 and field2 combined: a single dimension that would take the union of all values in both field1 and field2 (in my example, the values should be [a,b,c,d]).
As a reduce function you can assume reduceSum on value2, for example (allowing double counting for now).
(I have tagged dc.js and reductio because this could be useful for users of those libraries.)
First I need to point out that your data is denormalized, so the counts you get might be somewhat confusing, no matter what technique you use.
In standard usage of crossfilter, each row will be counted in exactly one bin, and all the bins in a group will add up to 100%. However, in your case, each row will be counted twice (unless the two fields are the same), so for example a pie chart wouldn't make any sense.
That said, the "tag dimension" feature is perfect for what you're trying to do.
The dimension declaration could be as simple as:
var tagDimension = cf.dimension(function(d) { return [d.field1,d.field2]; }, true);
Now each row will get counted twice: this dimension and its associated groups will act exactly as if each of the rows were duplicated, with one copy indexed by field1 and the other by field2.
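To make the "duplicated rows" behaviour concrete, here is a minimal plain-JavaScript sketch that mimics what a group with reduceSum on value2 would compute on such a dimension. It does not call crossfilter; the helper name sumByTag is made up for illustration, and it assumes (as described in this answer) that a row whose two fields are equal lands in its bin only once.

```javascript
// Sample rows from the question.
const rows = [
  { field1: "a", field2: "b", value1: 1, value2: 1 },
  { field1: "b", field2: "a", value1: 2, value2: 2 },
  { field1: "c", field2: "a", value1: 3, value2: 5 },
  { field1: "b", field2: "c", value1: 6, value2: 7 },
  { field1: "d", field2: "a", value1: 6, value2: 7 }
];

// Hypothetical helper: emulates a tag-dimension group that sums value2.
function sumByTag(data) {
  const sums = new Map();
  for (const d of data) {
    // Assumption: identical tags on one row are counted once, hence the Set.
    for (const tag of new Set([d.field1, d.field2])) {
      sums.set(tag, (sums.get(tag) || 0) + d.value2);
    }
  }
  // Return {key, value} pairs sorted by key.
  return [...sums.entries()]
    .sort(([x], [y]) => (x < y ? -1 : x > y ? 1 : 0))
    .map(([key, value]) => ({ key, value }));
}

const tagSums = sumByTag(rows);
console.log(tagSums);
// [ {key:'a', value:15}, {key:'b', value:10}, {key:'c', value:12}, {key:'d', value:7} ]
```

The bins add up to 44, twice the plain sum of value2 (22), which is the double counting mentioned above.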
If you made a bar chart with this, say, the total count would be 2N minus the number of rows where field1 === field2. If you click on bar 'b', all rows which have 'b' in either field will get selected. This double counting only affects groups built on this dimension, so any other charts will only see one copy of each row.
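The counting rule and the selection behaviour can be checked with a small plain-JavaScript sketch. Again this only emulates the semantics described here rather than calling crossfilter; tagsOf is a made-up helper, and the extra e,e row is hypothetical, added only so that the "minus" term is not zero.

```javascript
// Rows from the question, plus one hypothetical row where both fields match.
const sampleRows = [
  { field1: "a", field2: "b", value1: 1, value2: 1 },
  { field1: "b", field2: "a", value1: 2, value2: 2 },
  { field1: "c", field2: "a", value1: 3, value2: 5 },
  { field1: "b", field2: "c", value1: 6, value2: 7 },
  { field1: "d", field2: "a", value1: 6, value2: 7 },
  { field1: "e", field2: "e", value1: 0, value2: 0 } // hypothetical, not in the question
];

// Hypothetical helper: the distinct tags a row is indexed under.
const tagsOf = d => [...new Set([d.field1, d.field2])];

// Total of all bar heights if each bar counts rows: one per (row, tag) pair.
const totalBinCount = sampleRows.reduce((n, d) => n + tagsOf(d).length, 0);

// The same number via the formula 2N minus rows with field1 === field2.
const sameFieldRows = sampleRows.filter(d => d.field1 === d.field2).length;
const formulaCount = 2 * sampleRows.length - sameFieldRows;

// Clicking bar 'b' selects every row that has 'b' in either field.
const selectedByB = sampleRows.filter(d => tagsOf(d).includes("b"));

console.log(totalBinCount, formulaCount, selectedByB.length); // 11 11 3
```

Other dimensions would see those three selected rows once each, since the duplication exists only inside the tag dimension's own groups.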