My data set is an array of the following form:
[
{ "DATE" : "2020-01-02", "COUNTRY" : "Spain", "COUNT" : 110 },
{ ... },
{ ... }
]
There are multiple countries and multiple days. There are no gaps in dates.
I want to inject field DAYS_PASSED
(and subsequently use it for the X axis)
using the following algorithm:
DAYS_PASSED
for the previous day for the same country
and assign it to variable TEMP
. (If the previous day does not exist, assume 0);DAYS_PASSED
using the following formula: if TEMP > 0, then DAYS_PASSED = TEMP + 1
else-if COUNT > 100 then DAYS_PASSED = 1
else DAYS_PASSED = 0
So far I have done this in a preprocessing step (outside of Vega-Lite) but I was wondering if it was possible to migrate the calculation to Vega-Lite, maybe by plugging-in in a JavaScript function somehow?
I would also like to be able to expose 100 (from the COUNT > 100
condition) in the
graph so that the user can tweak it to, say, 200.
You can do this with a series of transforms; for example:
"transform": [
{"calculate": "toDate(datum.DATE)", "as": "date"},
{"calculate": "datum.COUNT < 100", "as": "pre100"},
{
"joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
"groupby": ["COUNTRY"]
},
{
"window": [{"op": "count", "as": "daysPassed"}],
"groupby": ["COUNTRY"],
"sort": [{"field": "date"}]
},
{"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
],
Here is a more full example showing this for a small dataset (vega editor):
{
"data": {
"values": [
{"DATE": "2020-02-02", "COUNTRY": "Spain", "COUNT": 50},
{"DATE": "2020-02-03", "COUNTRY": "Spain", "COUNT": 70},
{"DATE": "2020-02-04", "COUNTRY": "Spain", "COUNT": 110},
{"DATE": "2020-02-05", "COUNTRY": "Spain", "COUNT": 150},
{"DATE": "2020-02-06", "COUNTRY": "Spain", "COUNT": 200},
{"DATE": "2020-02-02", "COUNTRY": "Italy", "COUNT": 90},
{"DATE": "2020-02-03", "COUNTRY": "Italy", "COUNT": 100},
{"DATE": "2020-02-04", "COUNTRY": "Italy", "COUNT": 140},
{"DATE": "2020-02-05", "COUNTRY": "Italy", "COUNT": 190},
{"DATE": "2020-02-06", "COUNTRY": "Italy", "COUNT": 250}
]
},
"transform": [
{"calculate": "toDate(datum.DATE)", "as": "date"},
{"calculate": "datum.COUNT < 100", "as": "pre100"},
{
"joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
"groupby": ["COUNTRY"]
},
{
"window": [{"op": "count", "as": "daysPassed"}],
"groupby": ["COUNTRY"],
"sort": [{"field": "date"}]
},
{"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
],
"concat": [
{
"mark": "line",
"encoding": {
"x": {"field": "DATE", "type": "temporal"},
"y": {"field": "COUNT", "type": "quantitative"},
"color": {"field": "COUNTRY", "type": "nominal"}
}
},
{
"mark": "line",
"transform": [{"filter": "datum.daysPassed > 0"}],
"encoding": {
"x": {"field": "daysPassed", "type": "quantitative"},
"y": {"field": "COUNT", "type": "quantitative"},
"color": {"field": "COUNTRY", "type": "nominal"}
}
}
]
}