
vega-lite: how to aggregate by week


I have seen that it's possible to aggregate by several time units, for example by month, but not by week.

I have also seen that in Vega it's possible to customize the time unit: https://vega.github.io/vega/docs/transforms/timeunit/#chronological-time-units

Is it possible to use this in Vega-Lite to aggregate by week, for example to change this aggregation from month to week?

Thank you


Solution

  • You can group by week using a date-based timeUnit such as monthdate (or yearmonthdate, as in the examples below) with a step size of 7:

    "timeUnit": {"unit": "monthdate", "step": 7}
    

    For example:

    {
      "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
      "data": {"url": "data/seattle-temps.csv"},
      "mark": "line",
      "encoding": {
        "x": {"timeUnit": {"unit": "yearmonthdate", "step": 7}, "field": "date", "type": "temporal"},
        "y": {"aggregate": "mean", "field": "temp", "type": "quantitative"}
      }
    }
    


    Note, however, that this starts a new week at the beginning of each month, so a heatmap of day of week against week has gaps:

    {
      "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
      "data": {"url": "data/seattle-temps.csv"},
      "mark": "rect",
      "encoding": {
        "y": {"timeUnit": "day", "field": "date", "type": "ordinal"},
        "x": {"timeUnit": {"unit": "yearmonthdate", "step": 7}, "field": "date", "type": "ordinal"},
        "color": {"aggregate": "mean", "field": "temp", "type": "quantitative"}
      }
    }
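
To see where the gaps come from, here is a minimal Python sketch of the binning, assuming (as described above) that the 7-day bins are anchored on the first day of each month. The function name `month_anchored_bin` and the sample dates are hypothetical and only illustrate the arithmetic; this is not Vega's actual implementation.

```python
from datetime import date, timedelta


def month_anchored_bin(d: date, step: int = 7) -> date:
    """Floor d to the start of its step-day bin, counting from day 1 of its month.

    Assumption: bins restart at the first of every month, so they begin
    on days 1, 8, 15, 22 and 29.
    """
    offset = (d.day - 1) // step * step
    return d.replace(day=1 + offset)


# Walk across a month boundary and print the bin each day falls into.
start = date(2010, 1, 27)
for i in range(8):
    d = start + timedelta(days=i)
    print(d, "->", month_anchored_bin(d))
```

Under this assumption the bin starting on the 29th holds at most three days before the next month opens a new bin, so some day-of-week cells in that column have no data.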
    


    If you want finer control over where weeks start, that's unfortunately not expressible as a timeUnit, but you can use Vega-Lite's full transform syntax to build custom aggregates. For example, here we compute a week-of-year index as a running count of the Sundays in the data:

    {
      "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
      "data": {"url": "data/seattle-temps.csv"},
      "transform": [
        {"timeUnit": "yearmonthdate", "field": "date", "as": "date"},
        {
          "aggregate": [{"op": "mean", "field": "temp", "as": "temp"}],
          "groupby": ["date"]
        },
        {"calculate": "day(datum.date) == 0", "as": "sundays"},
        {
          "window": [{"op": "sum", "field": "sundays", "as": "week"}],
          "sort": "date"
        }
      ],
      "mark": "rect",
      "encoding": {
        "y": {"timeUnit": "day", "field": "date", "type": "ordinal", "title": "Day of Week"},
        "x": {"field": "week", "type": "ordinal", "title": "Week of year"},
        "color": {"aggregate": "mean", "field": "temp", "type": "quantitative"}
      }
    }
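
The Sunday-counting idea in the last spec can be sketched outside Vega-Lite as well. The Python below mirrors the calculate step (flag Sundays) and the window step (running sum over dates in sorted order). The function name `week_index_by_sundays` and the sample dates are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta
from itertools import accumulate


def week_index_by_sundays(days):
    """Map each date to the number of Sundays seen up to and including it."""
    days = sorted(days)
    # Python's weekday() uses Monday=0 ... Sunday=6, whereas the Vega
    # expression day() uses Sunday=0.
    is_sunday = [1 if d.weekday() == 6 else 0 for d in days]
    # A running sum plays the role of the window transform's cumulative sum.
    return dict(zip(days, accumulate(is_sunday)))


# Two weeks of daily dates starting Friday, 1 January 2010.
days = [date(2010, 1, 1) + timedelta(days=i) for i in range(14)]
weeks = week_index_by_sundays(days)
for d in days:
    print(d, "week", weeks[d])
```

Because the index increases only on Sundays, every week runs Sunday through Saturday regardless of month boundaries. Days before the first Sunday in the data get index 0.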
    
