javascript, data-visualization, vega-lite

Vega-Lite: How to sum by a field and use it as a category


My situation is the following. I have 60,000 images and 4 models. Each model tries to predict what is in each image, so I end up with a dataset in which every image shows up 4 times. I'd like to group the images according to how many models got each one correct. In other words, one image might have been predicted correctly by all 4 models, and another incorrectly by all 4, so the first should be in category 4 and the second in category 1.

How can I do this using Vega-Lite? (I know I could preprocess the data, but I'd like to do it directly in Vega-Lite.) I've tried the following, but without success:

vl.markPoint()
    .data(data)
    .transform(
      vl.groupby('image_id'),
      vl.joinaggregate( [{
      "op": "sum",
      "field": "acc",
      "as": "totalacc"}]),
      vl.calculate("datum.totalacc").as('total')
    )
    .encode(
      vl.y().sum('acc'),
      vl.x().fieldQ('total'),
      vl.detail().fieldQ('image_id')
    ).render();

Solution

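  • vl.groupby() by itself doesn't do anything to the data; passed as a separate transform, it has no effect on the joinaggregate that follows it. Chain .joinaggregate() onto it instead. I suspect what you want for your transforms is something like this: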

        .transform(
          vl.groupby('image_id')
            .joinaggregate([{
              "op": "sum",
              "field": "acc",
              "as": "totalacc"
            }]),
          vl.calculate("datum.totalacc").as('total')
        )
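
To make it clearer what the grouped join-aggregate computes, here is a minimal sketch in plain JavaScript that emulates it without Vega-Lite. The sample data and the helper names (joinAggregateSum, imagesPerCategory) are made up for illustration and are not part of the Vega-Lite API; it also assumes acc is 1 for a correct prediction and 0 otherwise, which the question does not state explicitly.

```javascript
// Hypothetical sample: per image, whether each of the 4 models was correct.
// Assumption: acc is 1 for a correct prediction and 0 for a wrong one.
const correct = { a: [1, 1, 1, 1], b: [0, 0, 0, 0], c: [1, 0, 1, 0] };

// Expand to one row per (image, model) pair, like the dataset in the question.
const rows = Object.entries(correct).flatMap(([image_id, accs]) =>
  accs.map((acc, i) => ({ image_id, model: i + 1, acc }))
);

// Emulates a join-aggregate grouped by `key`: sum `field` within each group
// and attach that sum to every row of the group. No rows are dropped.
function joinAggregateSum(data, key, field, as) {
  const sums = new Map();
  for (const d of data) {
    sums.set(d[key], (sums.get(d[key]) || 0) + d[field]);
  }
  return data.map((d) => ({ ...d, [as]: sums.get(d[key]) }));
}

// Counts distinct images for each value of totalacc.
function imagesPerCategory(data) {
  const seen = new Map(); // totalacc -> Set of image ids
  for (const d of data) {
    if (!seen.has(d.totalacc)) seen.set(d.totalacc, new Set());
    seen.get(d.totalacc).add(d.image_id);
  }
  const counts = {};
  for (const [category, ids] of seen) counts[category] = ids.size;
  return counts;
}

const joined = joinAggregateSum(rows, "image_id", "acc", "totalacc");
const counts = imagesPerCategory(joined);
console.log(counts);
```

With this sample, every row of image a carries totalacc 4, every row of b carries 0, and every row of c carries 2, so each of those three categories ends up with one image. Because the join-aggregate keeps all four rows per image, a chart that counts rows per category would overcount by a factor of 4; counting distinct image_id values, as above, avoids that.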