I'm working with the "airplane" data set from this reference: http://square.github.io/crossfilter/
date,delay,distance,origin,destination
01010001,14,405,MCI,MDW
01010530,-11,370,LAX,PHX
...
// Create the crossfilter for the relevant dimensions and groups.
var flight = crossfilter(flights),
    all = flight.groupAll(),
    date = flight.dimension(function(d) { return d.date; }),
    dates = date.group(d3.time.day),
    hour = flight.dimension(function(d) { return d.date.getHours() + d.date.getMinutes() / 60; }),
    hours = hour.group(Math.floor),
    delay = flight.dimension(function(d) { return Math.max(-60, Math.min(149, d.delay)); }),
    delays = delay.group(function(d) { return Math.floor(d / 10) * 10; }),
    distance = flight.dimension(function(d) { return Math.min(1999, d.distance); }),
    distances = distance.group(function(d) { return Math.floor(d / 50) * 50; });
According to the Crossfilter documentation, "groups don't observe the filters on their own dimension". So we should be able to get the filtered records from groups whose dimension is not currently filtered, shouldn't we?
I ran a test, but the result does not match that expectation:
console.dir(date.group().all()); // 50895 records
console.dir(distance.group().all()); // 297 records
date.filter([new Date(2001, 1, 1), new Date(2001, 2, 1)]);
console.dir(date.group().all()); // 50895 records => this number is still the same because we are filtering on its own dimension
console.dir(distance.group().all()); // 297 records => but this number is still the same too, and I don't know why
Could you please explain why the number of entries in "distance.group().all()" is still the same as before we applied the filter? Am I missing something here?
If we really cannot get the "filtered records" from the "distance" dimension this way, how can I achieve this?
Thanks.
So, yes, this is the expected behavior.
Crossfilter creates a "bin" in the group for every key it finds by applying the dimension's value function and the group's key function to each row. When a filter is applied, it runs the reduce-remove function for each row that gets filtered out, which by default decrements the count in that row's bin.
The result is that empty bins still exist, but they have a value of 0, so the array returned by .all() keeps the same length.
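To make that concrete, here is a rough plain-JavaScript model of a count group. It is only a sketch of the behavior described above, not Crossfilter's actual implementation; `makeCountGroup` and `applyFilter` are made-up names, and the four rows are invented sample data.

```javascript
// Simplified model of a count group; NOT Crossfilter's real code.
// makeCountGroup and applyFilter are hypothetical names used only here.
function makeCountGroup(rows, keyFn) {
  var bins = {};
  // One bin per distinct key, created up front from all rows.
  rows.forEach(function (row) {
    var key = keyFn(row);
    bins[key] = (bins[key] || 0) + 1;
  });
  return {
    // Mimics the default reduce-remove: decrement the bin of every row
    // that fails the filter. One-shot: assumes no filter was applied before.
    applyFilter: function (predicate) {
      rows.forEach(function (row) {
        if (!predicate(row)) bins[keyFn(row)] -= 1;
      });
    },
    // Returns every bin, including those whose count dropped to 0.
    all: function () {
      return Object.keys(bins).map(function (key) {
        return { key: key, value: bins[key] };
      });
    }
  };
}

// Invented sample rows (not taken from the flights data set).
var rows = [
  { month: 1, distance: 405 },
  { month: 1, distance: 370 },
  { month: 2, distance: 370 },
  { month: 3, distance: 1200 }
];

var byDistance = makeCountGroup(rows, function (d) { return d.distance; });
var before = byDistance.all();

// Filter on a different field, the way date.filter(...) affects distance groups.
byDistance.applyFilter(function (d) { return d.month === 1; });
var after = byDistance.all();

console.log(before.length, after.length); // 3 3
```

The number of bins stays at 3 after filtering. What changes is the values: the 370 bin drops from 2 to 1 and the 1200 bin drops to 0, which is the same pattern as the 297 entries in the question.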
EDIT: here is the Crossfilter Gotchas entry with further explanation.
If you want to remove the zeros, you can use a "fake group" to do that.
function remove_empty_bins(source_group) {
    return {
        all: function () {
            return source_group.all().filter(function(d) {
                //return Math.abs(d.value) > 0.00001; // if using floating-point numbers
                return d.value !== 0; // if integers only
            });
        }
    };
}
https://github.com/dc-js/dc.js/wiki/FAQ#remove-empty-bins
This function wraps the group in an object which implements .all() by calling source_group.all() and then filtering the result. So if you're using dc.js, you could supply this fake group to your chart like so:
chart.group(remove_empty_bins(yourGroup));
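Since the fake group only relies on .all(), you can try it without Crossfilter or dc.js loaded. The sketch below repeats the function so it runs on its own and feeds it a hand-written stand-in object; `stubGroup` is hypothetical, and its keys and values are made up for illustration.

```javascript
// Same wrapper as above, repeated so this snippet runs on its own.
function remove_empty_bins(source_group) {
  return {
    all: function () {
      return source_group.all().filter(function (d) {
        return d.value !== 0; // integers only
      });
    }
  };
}

// Hypothetical stand-in for a Crossfilter group: only .all() is provided,
// and the keys and values are made-up illustration data.
var stubGroup = {
  all: function () {
    return [
      { key: 350, value: 2 },
      { key: 400, value: 0 },
      { key: 1200, value: 0 },
      { key: 1950, value: 5 }
    ];
  }
};

var nonEmpty = remove_empty_bins(stubGroup).all();
console.log(nonEmpty.length); // 2
console.log(nonEmpty.map(function (d) { return d.key; })); // [ 350, 1950 ]
```

The filtering happens each time .all() is called, so the wrapper always reflects whatever the underlying group reports at that moment.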