Is there a Vega/Vega-Lite transform that I can use to select the first n rows in a data set?
Suppose I get a dataset from a URL such as:
| Person | Height |
|---|---|
| Jeremy | 6.2 |
| Alice | 6.0 |
| Walter | 5.8 |
| Amy | 5.6 |
| Joe | 5.5 |
and I want to create a bar chart showing the height of only the three tallest people. Assume that we know for certain that the dataset from the URL is already sorted. Assume that we cannot change the data as returned by the URL.
I want to do something like this:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "url": "heights.csv"
  },
  "transform": [
    {"head": 3}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Person", "type": "nominal"},
    "y": {"field": "Height", "type": "quantitative"}
  }
}
except that the head transform does not actually exist. Is there something else I can do to get the same effect?
The Vega-Lite documentation has an example along these lines in filtering top-k items.
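Your case is a bit more specialized: you do not want to order based on rank, but rather based on the original ordering of the data. You can do this using a count-based window transform followed by an appropriate filter. With no sort specified and the default window frame, the count is a running count, so each row receives its 1-based position in the original data. For example: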
{
  "data": {
    "values": [
      {"Person": "Jeremy", "Height": 6.2},
      {"Person": "Alice", "Height": 6.0},
      {"Person": "Walter", "Height": 5.8},
      {"Person": "Amy", "Height": 5.6},
      {"Person": "Joe", "Height": 5.5}
    ]
  },
  "transform": [
    {"window": [{"op": "count", "as": "count"}]},
    {"filter": "datum.count <= 3"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Height", "type": "quantitative"},
    "y": {"field": "Person", "type": "nominal", "sort": null}
  }
}
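Applied to the spec from the question, the same idea might look like the sketch below. It assumes heights.csv (the file name used in the question) has Person and Height columns and is already sorted. It also swaps in the row_number window operation in place of count, which is an alternative that states the intent more directly; the output field name "row" is an arbitrary choice.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "url": "heights.csv"
  },
  "transform": [
    {"window": [{"op": "row_number", "as": "row"}]},
    {"filter": "datum.row <= 3"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "Person", "type": "nominal", "sort": null},
    "y": {"field": "Height", "type": "quantitative"}
  }
}
```

Because the window transform has no sort, rows are numbered in the order they arrive from the URL, so the filter keeps the first three. Setting "sort": null on the nominal axis keeps the bars in that same order rather than sorting them alphabetically.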