I was hoping to use reductio to compute averages within my crossfilter groups. My dataset includes missing values (represented by null) that I'd like to exclude when calculating the average. However, I don't see a way to tell reductio to exclude certain values, and it treats the null values as 0.
I wrote a custom reduce function to accomplish this without using reductio:
    function reduceAvg(attr) {
        return {
            init: function() {
                return {
                    count: 0,
                    sum: 0,
                    avg: 0
                };
            },
            add: function(reduction, record) {
                if (record[attr] !== null) {
                    reduction.count += 1;
                    reduction.sum += record[attr];
                    if (reduction.count > 0) {
                        reduction.avg = reduction.sum / reduction.count;
                    }
                    else {
                        reduction.avg = 0;
                    }
                }
                return reduction;
            },
            remove: function(reduction, record) {
                if (record[attr] !== null) {
                    reduction.count -= 1;
                    reduction.sum -= record[attr];
                    if (reduction.count > 0) {
                        reduction.avg = reduction.sum / reduction.count;
                    }
                    else {
                        reduction.avg = 0;
                    }
                }
                return reduction;
            }
        };
    }
Is there a way to do this using reductio? Maybe using exception aggregation? I haven't fully wrapped my head around how exceptions work in reductio.
I think you should be able to average over 'myAttr' excluding null and undefined by doing:
    reductio()
        .filter(function(d) { return d[myAttr] !== null && d[myAttr] !== undefined; })
        .avg(function(d) { return d[myAttr]; });
If that doesn't work as expected, please file an issue, since that would be a bug.
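To make the intended behaviour concrete, here is a minimal plain-JavaScript sketch of what I'd expect filtering before averaging to compute. It does not use reductio or crossfilter and is not reductio's internal code; makeFilteredAvg and the 'score' attribute are hypothetical names for illustration only. Records rejected by the filter are skipped on both add and remove, so they never touch the count or the sum.

```javascript
// Hypothetical helper, NOT part of reductio's API: builds init/add/remove
// functions that ignore any record for which keep(d) is false.
function makeFilteredAvg(keep, value) {
  return {
    init: function() {
      return { count: 0, sum: 0, avg: 0 };
    },
    add: function(p, d) {
      if (keep(d)) {
        p.count += 1;
        p.sum += value(d);
        p.avg = p.sum / p.count; // count is at least 1 here
      }
      return p;
    },
    remove: function(p, d) {
      if (keep(d)) {
        p.count -= 1;
        p.sum -= value(d);
        p.avg = p.count > 0 ? p.sum / p.count : 0; // avoid dividing by zero
      }
      return p;
    }
  };
}

var myAttr = 'score'; // assumed attribute name, for illustration only
var reducer = makeFilteredAvg(
  function(d) { return d[myAttr] !== null && d[myAttr] !== undefined; },
  function(d) { return d[myAttr]; }
);

// Two usable values (4 and 8), one null, and one record missing the attribute.
var records = [{ score: 4 }, { score: null }, { score: 8 }, {}];
var result = records.reduce(function(p, d) { return reducer.add(p, d); }, reducer.init());
// result is { count: 2, sum: 12, avg: 6 }
```

The three functions have the shape that crossfilter's group.reduce(add, remove, initial) expects, so the same object could be plugged into a group directly if the reductio version misbehaves.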