I'm working on a data visualization that has an odd little bug:
It's a little tricky to see, but essentially, when I click on a point in the line chart, that point corresponds to a specific issue of a magazine. The choropleth updates to reflect geodata for that issue, but, critically, the geodata is for a sampled period that corresponds to the issue. Essentially, the choropleth will look the same for any issue between January-June or July-December of a given year.
As you can see, I have a key called Sampled Issue Date (for Geodata), and the value should be the date of the issue on which the geodata is based (basically, they would collect the geographical distribution for one specific issue and call it representative of ALL data in a six-month period). Yet when I first click on an issue, I always get the last sampled date in my data. All of the geodata is correct and, annoyingly, all subsequent clicks display the correct information. So it's only that first click (after refreshing the page OR clearing an issue) that has the problem.
Honestly, my code is a nightmare right now because I'm focused on debugging, but you can see my reducer for the remove function on GitHub which is also copy/pasted below:
// Reducer function for raw geodata
function geoReducerAdd(p, v) {
  // console.log(p.sampled_issue_date, v.sampled_issue_date, state.periodEnding, state.periodStart)
  ++p.count
  p.sampled_mail_subscriptions += v.sampled_mail_subscriptions
  p.sampled_single_copy_sales += v.sampled_single_copy_sales
  p.sampled_total_sales += v.sampled_total_sales
  p.state_population = v.state_population // only valid for population viz
  p.sampled_issue_date = v.sampled_issue_date
  return p
}
function geoReducerRemove(p, v) {
  const currDate = new Date(v.sampled_issue_date)
  // if(currDate.getFullYear() === 1921) {
  //   console.log(currDate)
  // }
  currDate <= state.periodEnding && currDate >= state.periodStart ? console.log(v.sampled_issue_date, p.sampled_issue_date) : null
  const dateToRender = currDate <= state.periodEnding && currDate >= state.periodStart ? v.sampled_issue_date : p.sampled_issue_date
  --p.count
  p.sampled_mail_subscriptions -= v.sampled_mail_subscriptions
  p.sampled_single_copy_sales -= v.sampled_single_copy_sales
  p.sampled_total_sales -= v.sampled_total_sales
  p.state_population = v.state_population // only valid for population viz
  p.sampled_issue_date = dateToRender
  return p
}
// generic georeducer
function geoReducerDefault() {
  return {
    count: 0,
    sampled_mail_subscriptions: 0,
    sampled_single_copy_sales: 0,
    sampled_total_sales: 0,
    state_population: 0,
    sampled_issue_date: ""
  }
}
The problem could be somewhere else, but I don't think it's a crossfilter issue (I'm not running into the "two groups from the same dimension" problem for sure) and adding additional logic to the add reducer makes things even less predictable (understandably - I don't ever really need to render the sample date for all values anyway.) The point of this is that I'm completely lost about where the flaw in my logic is, and I'd love some help!
EDIT: Note that the reducers are for the reduce method on a dc.js dimension, not the native JavaScript reducer! :D
Two crossfilters! Always fun to see that... but it can be tricky because nothing in dc.js directly supports that, except for the chart registry. You're on your own for filtering between different chart groups, and it can be tricky to map between data sets with different time resolutions and so on.
As I understand your app, when a date is selected in the line chart, the choropleth and accompanying text should have exactly one row from the geodata dataset selected per state.
The essential problem is that Crossfilter is not great at telling you which rows are in any given bin. So even though there's just one row selected, you don't know what it is!
This is the same problem that makes minimum, maximum, and median reductions surprisingly complicated. You often end up building new data structures to capture what crossfilter throws away in the name of efficiency.
I'll go with a general solution that's more than you need, but it can be helpful in similar situations. The only alternative that I know of is to go completely outside crossfilter and look in the original dataset. That's fine too, and maybe more efficient. But it can be buggy, and it's nice to work within the system.
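For comparison, here is a rough sketch of that "look in the original dataset" alternative. It assumes the raw geodata rows are available as an array, that each row carries a state field (the name `state` is hypothetical, as is the helper `findSampledIssueDate`), and that the period bounds are Date objects like state.periodStart and state.periodEnding in the question:

```javascript
// Hypothetical helper: scan the raw rows for the one row of a given state
// whose sampled date falls inside the selected period.
// The `state` field name is an assumption about the row shape.
function findSampledIssueDate(rows, stateName, periodStart, periodEnding) {
  const match = rows.find(row => {
    const d = new Date(row.sampled_issue_date)
    return row.state === stateName && d >= periodStart && d <= periodEnding
  })
  return match ? match.sampled_issue_date : null
}

// Made-up sample rows, for illustration only.
const sampleRows = [
  { state: 'Ohio', sampled_issue_date: '1921-03-05' },
  { state: 'Ohio', sampled_issue_date: '1921-09-10' },
  { state: 'Iowa', sampled_issue_date: '1921-09-10' }
]
const found = findSampledIssueDate(
  sampleRows, 'Ohio', new Date('1921-07-01'), new Date('1921-12-31'))
```

This is a linear scan per lookup, which is probably fine for one row per state per period, but it duplicates filtering logic that crossfilter already performs.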
So let's keep track of which dates we've seen per bin. When we start out, every bin will have all the dates. Once a date is selected, there will be only one date (but not exactly the one that was selected, because of your two-crossfilter setup).
Instead of the sampled_issue_date stuff, we'll keep track of an object called date_counts now:
// Reducer function for raw geodata
function geoReducerAdd(p, v) {
  // ...
  const canonDate = new Date(v.sampled_issue_date).getTime()
  p.date_counts[canonDate] = (p.date_counts[canonDate] || 0) + 1
  return p
}
function geoReducerRemove(p, v) {
  // ...
  const canonDate = new Date(v.sampled_issue_date).getTime()
  if(!--p.date_counts[canonDate])
    delete p.date_counts[canonDate]
  return p
}
// generic georeducer
function geoReducerDefault() {
  return {
    // ...
    date_counts: {}
  }
}
What does it do?
const canonDate = new Date(v.sampled_issue_date).getTime()
Maybe this is paranoid, but this canonicalizes the input dates by converting them to the number of milliseconds since 1970. I'm sure you'd be safe using the string dates directly, but who knows, there could be a stray space or zero or something.
Object keys are always strings, so a Date object used as a key would be coerced to its string form. Converting to an integer timestamp first gives a predictable key.
p.date_counts[canonDate] = (p.date_counts[canonDate] || 0) + 1
When we add a row, we'll check if we currently have a count for the row's date. If so, we'll use the count we have. Otherwise we'll default to zero. Then we'll add one.
if(!--p.date_counts[canonDate])
delete p.date_counts[canonDate]
When we remove a row, we know that we have a count for the date for that row (because crossfilter won't tell us it's removing the row unless it was added earlier). So we can go ahead and decrement the count. Then if it hits zero we can remove the entry.
Like I said, it's overkill. In your case, the count will only go to 1 and then drop to 0. But it's not much more expensive to do this than to keep just a single date.
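To see the counting in action without crossfilter, here is a small simulation. The reducers are the date_counts parts from above; the rows are made-up sample data, and the loops only imitate the order in which crossfilter would call the add and remove functions (all rows added at start, then the non-matching rows removed when a period is selected):

```javascript
function geoReducerAdd(p, v) {
  const canonDate = new Date(v.sampled_issue_date).getTime()
  p.date_counts[canonDate] = (p.date_counts[canonDate] || 0) + 1
  return p
}
function geoReducerRemove(p, v) {
  const canonDate = new Date(v.sampled_issue_date).getTime()
  if (!--p.date_counts[canonDate])
    delete p.date_counts[canonDate]
  return p
}
function geoReducerDefault() {
  return { date_counts: {} }
}

// Hypothetical rows for one state, one row per sampled period.
const rows = [
  { sampled_issue_date: '1921-03-05' },
  { sampled_issue_date: '1921-09-10' },
  { sampled_issue_date: '1922-03-04' }
]

// Unfiltered: every row is added to the bin, so it holds every date.
const bin = geoReducerDefault()
for (const row of rows) geoReducerAdd(bin, row)
const countBefore = Object.keys(bin.date_counts).length

// Selecting the first period: the other two rows are removed from the bin.
for (const row of rows.slice(1)) geoReducerRemove(bin, row)
const remaining = Object.keys(bin.date_counts).map(Number)
```

After the removals, remaining holds a single timestamp, the one for the row that is still selected, regardless of the order in which rows were added.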
When we render the side panel, there should only be one date left in date_counts for that selected item.
console.assert(Object.keys(date_counts).length === 1) // only one entry
console.assert(Object.entries(date_counts)[0][1] === 1) // with count 1
document.getElementById('geo-issue-date').textContent = new Date(+Object.keys(date_counts)[0]).format('mmm dd, yyyy')
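Note that .format('mmm dd, yyyy') is not a built-in Date method, so the line above presumably relies on a date-formatting library already loaded in the page. If that is not the case, a sketch using only built-ins could look like the following. The helper names `soleSampledDate` and `formatIssueDate` are made up for illustration, and toLocaleDateString is swapped in for the library call:

```javascript
// Hypothetical helper: return the single date left in date_counts as a Date,
// or null if the bin does not contain exactly one date.
function soleSampledDate(dateCounts) {
  const keys = Object.keys(dateCounts)
  if (keys.length !== 1) return null
  // Keys are stringified millisecond timestamps, so convert back to a number.
  return new Date(Number(keys[0]))
}

// Hypothetical formatter using the built-in locale API instead of .format().
// timeZone: 'UTC' avoids an off-by-one day for dates parsed as UTC midnight.
function formatIssueDate(date) {
  return date.toLocaleDateString('en-US', {
    month: 'short', day: '2-digit', year: 'numeric', timeZone: 'UTC'
  })
}

// Usage with a made-up bin value.
const exampleCounts = { [Date.UTC(1921, 2, 5)]: 1 }
const exampleDate = soleSampledDate(exampleCounts)
const exampleLabel = exampleDate ? formatIssueDate(exampleDate) : ''
```

Returning null instead of asserting lets the side panel fall back to an empty label while nothing is selected.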
From a usability perspective, I would recommend not calling filter(null) on mouseleave, or if you really want to, then put it on a timeout which gets cancelled when you see a mouseenter. One should be able to "scrub" over the line chart and see the changes over time in the choropleth without accidentally switching back to the unfiltered colors.
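A minimal sketch of the timeout idea, not tied to any dc.js API: `makeDeferredClear` is a made-up helper, `clearFilter` stands in for whatever currently calls filter(null) and redraws, and the 300 ms default is an arbitrary guess. The timer functions are injectable only so the behavior can be checked without waiting:

```javascript
// Hypothetical helper: delay clearing the filter after mouseleave, and cancel
// the pending clear if the pointer enters another dot before the delay elapses.
function makeDeferredClear(
  clearFilter,
  delayMs = 300,
  // Wrapped in arrow functions so the browser timers keep their expected `this`.
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: id => clearTimeout(id)
  }
) {
  let pending = null
  return {
    onMouseLeave() {
      if (pending !== null) timers.clear(pending)
      pending = timers.set(() => {
        pending = null
        clearFilter()
      }, delayMs)
    },
    onMouseEnter() {
      if (pending !== null) {
        timers.clear(pending)
        pending = null
      }
    }
  }
}
```

The two returned handlers would be attached to the dots' mouseleave and mouseenter events, so moving directly from one dot to the next never passes through the unfiltered state.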
I also noticed (and filed) an issue: dots to the right of the mouse pointer are shown, which makes them difficult to click. Because the dots overlap, only a thin crescent of each one is hoverable. At least with my trackpad, the click causes the pointer to travel leftward. (I can see the date go back a week in the tooltip and then return.) It's not as much of a problem when you're zoomed in.