Tags: javascript, dc.js, crossfilter

How to use scatter chart with complex data?


I'm trying to implement a scatter chart by following an example.

In the example, the dimension is created like this:

runDimension = ndx.dimension(function(d) {return [+d.Expt, +d.Run]; });

The example's data:

Expt Run Speed
1    1   850
1    2   740
1    3   900

I want to use the same chart, but my data is in the following format:

[
 {
  "Timestamp":"2016-12-15T17:29:53Z",
  "idgame":"euro",
  "users":{
    "Jo": {
      "energy":200,
      "jump_height":0.5
     },
    "Bob": {
      "energy":220,
      "jump_height":0.35
     }
  }
 },
 {
  "Timestamp":"2016-12-15T17:29:55Z",
  "idgame":"euro",
  "users":{
    "Jo": {
      "energy":120,
      "jump_height":0.15
     },
    "Bob": {
      "energy":240,
      "jump_height":0.75
     }
  }
 }
]

I need to build the following chart, where the x-axis is the timestamp and the y-axis is jump_height:

[image: the desired chart]

My data is already in crossfilter, so I can't change it. How can I create a suitable dimension for the current format?


Solution

  • I'm still not convinced this is worth the effort, versus biting the bullet and flattening your data and fixing up the other charts. If your data isn't flat you will be fighting crossfilter and dc.js every step of the way.
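
For comparison, here is a minimal sketch of what flattening could look like, assuming the nested format from the question. The helper name flattenEvents and the flat field name user are hypothetical, not part of dc.js or crossfilter; the idea is simply one flat row per user per event.

```javascript
// Hypothetical helper (not part of dc.js or crossfilter):
// turn each nested event into one flat row per user.
function flattenEvents(events) {
  var rows = [];
  events.forEach(function(event) {
    Object.keys(event.users).forEach(function(user) {
      rows.push({
        Timestamp: new Date(event.Timestamp),
        idgame: event.idgame,
        user: user, // assumed field name for the user key
        energy: event.users[user].energy,
        jump_height: event.users[user].jump_height
      });
    });
  });
  return rows;
}

// Sample data copied from the question.
var sample = [
  { Timestamp: "2016-12-15T17:29:53Z", idgame: "euro",
    users: { Jo: { energy: 200, jump_height: 0.5 },
             Bob: { energy: 220, jump_height: 0.35 } } },
  { Timestamp: "2016-12-15T17:29:55Z", idgame: "euro",
    users: { Jo: { energy: 120, jump_height: 0.15 },
             Bob: { energy: 240, jump_height: 0.75 } } }
];

var rows = flattenEvents(sample);
console.log(rows.length); // one row per user per event
```

With rows in this shape, every symbol corresponds to exactly one row, which is the situation an ordinary crossfilter group is designed for.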

    That said, as usual, it's possible!

    We can't use the series chart, because that requires all the data to be present in one group. But since you want to produce multiple symbols from each row of data, an ordinary crossfilter group can't produce the data you need.

    Maybe we could use a fake group, but that would be complicated. Instead, let's produce a dimension and group for each user, and then sew them together using a composite chart.

    First, we need to parse those timestamps:

    data.forEach(function(d) {
      d.Timestamp = new Date(d.Timestamp);
    });
    

    Next, we'll retrieve a list of all users by pulling the user keys from each event (timestamp), concatenating them, and then passing them through d3.set to remove duplicates:

    // retrieve all users from all events
    var users = data.reduce(function(p, event) {
      return p.concat(Object.keys(event.users));
    }, []);
    users = d3.set(users).values();
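
If d3.set is not available in your d3 version (as far as I know, newer d3 releases no longer ship it), the same list can be built with the standard JavaScript Set. This is a sketch under that assumption; the events variable and its trimmed-down contents are only for illustration.

```javascript
// Trimmed-down events: only the user keys matter for this step.
var events = [
  { users: { Jo: {}, Bob: {} } },
  { users: { Jo: {}, Bob: {} } }
];

// Collect the user keys from every event...
var allKeys = events.reduce(function(p, event) {
  return p.concat(Object.keys(event.users));
}, []);

// ...and keep the first occurrence of each, in insertion order.
var uniqueUsers = Array.from(new Set(allKeys));
console.log(uniqueUsers);
```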
    

    In the rest of the code, we'll assume the same users appear in every event. They could differ, but handling that adds extra complexity, and this answer is complicated enough. Just ping me if you need that feature!

    We'll create the chart, crossfilter, and a scale which will assign symbols to users:

    var chart = dc.compositeChart("#test");
    var ndx = crossfilter(data);
    var symbolScale = d3.scale.ordinal().range(d3.svg.symbolTypes);
    

    Now we can create the composite chart. (We'll add the scatter subcharts in the next step.)

    chart
      .width(768)
      .height(480)
      .x(d3.time.scale())
      .xUnits(d3.time.seconds)
      .elasticX(true)
      .elasticY(true)
      .brushOn(false)
      .clipPadding(10)
      .shareTitle(false) // allow default scatter title to work
      .shareColors(true) // but use different colors for subcharts
      .legend(dc.legend().x(350).y(350).itemHeight(13)
              .gap(5).horizontal(1).legendWidth(140).itemWidth(70));
    

    We set up the X axis with a time scale, with a resolution of seconds. Both axes are elastic. We need to share colors so that each subchart will be assigned its own color. (The legend is perhaps overspecified - I copied this from another example.)

    Finally we get to the meat of it. For each user, we'll create a subchart, and we'll tell the composite chart to compose all of them:

    chart.compose(users.map(function(user) {
      var userDimension = ndx.dimension(function(d) {
        return [d.Timestamp, d.users[user].jump_height];
      });
      var jumpGroup = userDimension.group();
      console.log(user, jumpGroup.all());
      var scatter = dc.scatterPlot(chart)
        .symbol(symbolScale(user))
        .dimension(userDimension)
        .group(jumpGroup)
        .colorAccessor(function() { return user; })
        .symbolSize(8)
        .highlightedSize(10);
      return scatter;
    }));
    

    We're creating a new dimension for each chart. This is because the dc.js scatter plot expects the keys to include both X and Y coordinates, and we can only access the Y coordinate (jump_height) once we know the user. Once we get past that, the group is simple.
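
To see why the key depends on the user, here is a small sketch that runs the same accessor outside of crossfilter. The factory name makeKeyAccessor is hypothetical; the function it returns has the same body as the dimension accessor above.

```javascript
// Hypothetical factory: builds the per-user key accessor used for the dimension.
function makeKeyAccessor(user) {
  return function(d) {
    return [d.Timestamp, d.users[user].jump_height];
  };
}

// Sample data from the question, with timestamps already parsed.
var parsed = [
  { Timestamp: new Date("2016-12-15T17:29:53Z"),
    users: { Jo: { jump_height: 0.5 }, Bob: { jump_height: 0.35 } } },
  { Timestamp: new Date("2016-12-15T17:29:55Z"),
    users: { Jo: { jump_height: 0.15 }, Bob: { jump_height: 0.75 } } }
];

// The same rows yield different [x, y] keys depending on the user.
var joKeys = parsed.map(makeKeyAccessor("Jo"));
var bobKeys = parsed.map(makeKeyAccessor("Bob"));
console.log(joKeys, bobKeys);
```

Each subchart therefore needs its own dimension: the X part of the key is shared, but the Y part only exists once a user has been chosen.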

    The chart will assign the symbol and color based on the user key. These both work the same; an ordinal scale will assign a new value from the range for each new value it encounters in the domain. The only difference is that we're using the default color scale, whereas we had to specify our own symbol scale.
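
The ordinal-scale behavior described here can be illustrated without d3. The function makeOrdinal below is a hypothetical stand-in written for this sketch, not d3's implementation, and the three symbol names are just an illustrative range.

```javascript
// Hypothetical stand-in for an ordinal scale with an implicit domain:
// each new input gets the next value from the range, wrapping around.
function makeOrdinal(range) {
  var seen = []; // domain values in order of first encounter
  return function(value) {
    var i = seen.indexOf(value);
    if (i === -1) {
      seen.push(value);
      i = seen.length - 1;
    }
    return range[i % range.length];
  };
}

var symbolFor = makeOrdinal(["circle", "cross", "diamond"]);
var first = symbolFor("Jo");   // first new value takes the first range entry
var second = symbolFor("Bob"); // second new value takes the next one
var again = symbolFor("Jo");   // a repeated value keeps its earlier assignment
console.log(first, second, again);
```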

    Here's a fiddle: https://jsfiddle.net/gordonwoodhull/3m4mv3xf/19/