
Plot the top X values in DCjs (not grouped/value counted)


I'd like to plot the top X values of a dimension in a row chart, ideally labeled using one dimension but using the value of another for the size of the bars.

Essentially a presentation of data like the following:

Sally: 1
Fred: 0.7
Bob: 0.5
Francis: 0.4
George: 0.2
Sam: 0.18
Susan: 0.16
Sarah: 0.15
Tom: 0.15
Simon: 0.14
...

const rowChart1 = dc.rowChart('#id');
const valDim = ndx.dimension(function(d) { return d.data; });
const valGroup = valDim.group().reduceCount(); // counts the records in each bin
rowChart1.dimension(valDim).group(valGroup);

This plots the value counts rather than the values themselves.

Specifically I'm looking to make a rowChart of the top N data points, where the length of the bars is determined by the value of the data points, not the number of data points with that value.

I.e. Sally would have her own bar, and it would be 100% of the x-axis, while Fred's bar would be 70% and Simon's bar would be 14% of the x-axis.


Solution

  • If I understand your question correctly, the conceptual problem may be the distinction between the crossfilter definition of dimension, which means "a column that you bin and filter on", versus the English/math definition of the word, which might mean "any column of data" or might mean a geometric direction on a chart.

    There's always at least one geometric "dimension" on every chart which is not associated with a "crossfilter dimension" because it is aggregated. In a line chart Y is driven by a group reduction/aggregation; in the row chart X is.

    I understand you have a column in your data which is a unique key, say name, which you want to map to the row chart's Y axis, and a second column, x, which you want to encode on the X axis without aggregation. You can use crossfilter's group.reduceSum(): since only one record will land in each bin, the sum of x is just x.

    Since name is a unique key, there will be only one x per name.
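
To see why the sum is harmless here, the bin-and-sum step can be imitated in plain JavaScript without crossfilter. This is only an illustration of the idea, not how crossfilter is implemented; `sumByKey` and `sample` are hypothetical names used for the sketch.

```javascript
// Illustrative only: group records by a key and sum a value per group.
function sumByKey(records, keyOf, valueOf) {
  const sums = new Map();
  for (const r of records) {
    const k = keyOf(r);
    sums.set(k, (sums.get(k) || 0) + valueOf(r));
  }
  // Report bins in the {key, value} shape that crossfilter groups use.
  return [...sums].map(([key, value]) => ({key, value}));
}

// Hypothetical sample rows; name is assumed to be unique.
const sample = [
  {name: 'Sally', x: 1},
  {name: 'Fred', x: 0.7},
  {name: 'Bob', x: 0.5}
];

// Each bin receives exactly one record, so its sum is that record's x.
const summed = sumByKey(sample, d => d.name, d => d.x);
console.log(summed);
```

If two rows shared a name, their x values would be added together, which is why the unique key matters.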

    Let's say you have data like this:

    const data = [
      {name: 'Sally', x: 1},
      {name: 'Fred', x: 0.7},
      {name: 'Bob', x: 0.5},
      {name: 'Francis', x: 0.4},
      {name: 'George', x: 0.2},
      {name: 'Sam', x: 0.18},
      {name: 'Susan', x: 0.16}
      // ...
    ];
    

    Then the crossfilter initialization might look like this:

    const xf = crossfilter(data),
      dim = xf.dimension(d => d.name), // bin and filter on this
      group = dim.group().reduceSum(d => d.x); // here be values
    

    and the chart like this:

    const rowChart = dc.rowChart('#row');
    
    rowChart
      .dimension(dim)
      .group(group)
      .render();
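
The question also asks for only the top N rows, which the chart code above does not limit. One approach often used with dc.js is a "fake group": an object whose `all()` returns the source group's `top(n)`. The sketch below assumes that approach. `topNGroup` is a hypothetical helper name, and `standInGroup` is a hand-written stand-in for a crossfilter group (it implements only `all()` and `top(k)`) so that the snippet runs without the libraries.

```javascript
// Hypothetical helper: wrap a group so that all() yields only the n largest bins.
// Relies on the source group providing top(k), as crossfilter groups do.
function topNGroup(sourceGroup, n) {
  return {
    all: () => sourceGroup.top(n)
  };
}

// Stand-in for a crossfilter group (assumption: only all() and top(k) are needed).
const bins = [
  {key: 'Bob', value: 0.5},
  {key: 'Fred', value: 0.7},
  {key: 'Francis', value: 0.4},
  {key: 'Sally', value: 1}
];
const standInGroup = {
  all: () => bins,
  top: k => [...bins].sort((a, b) => b.value - a.value).slice(0, k)
};

const top3 = topNGroup(standInGroup, 3);
console.log(top3.all().map(kv => kv.key).join(', ')); // Sally, Fred, Bob
```

With the real libraries, the wrapper would presumably be passed as `.group(topNGroup(group, 10))` in place of `.group(group)`. Row charts in dc.js also have a `.cap(n)` setting, which may be simpler, though by default it collects the remaining rows into an "Others" bar.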
    


    This might sound really complicated if you just want to plot x against some names, but dc.js and crossfilter are optimized for the case where there will be filtering between charts. No matter what they draw, crossfilter will always filter the rows, sort the rows into buckets, and reduce those buckets.

    If you don't use filtering, then these libraries are overkill. But if you do want to filter, it's really nice to have a library with a data model that takes care of it.