I have a list of movies like this:
[
{
"title":"X",
"genres":[
{
"tag":"Horror"
},
{
"tag":"Thriller"
},
{
"tag":"Mystery"
}
]
},
{
"title":"Zero Dark Thirty",
"genres":[
{
"tag":"Thriller"
},
{
"tag":"Drama"
},
{
"tag":"History"
},
{
"tag":"War"
}
]
}
]
I want to query all unique genres and count the number of movies, where the output looks like this:
{
"Horror":1,
"Thriller":2,
"Mystery":1,
"Drama":1,
"History":1,
"War":1
}
Is this possible with jq
?
Edit: I also need to be able to handle movies without genres. These can be ignored. For example:
[
{
"title":"Some movie",
"genres":[
{
"tag": null
}
]
},
{
"title":"Some other movies",
"genres":[
{
"tag":"Comedy"
},
{
"tag":"Drama"
}
]
}
]
Yes, it is.
map(.genres[].tag)
| group_by(.)
| map({ key:first, value:length })
| from_entries
Output:
{
"Drama": 1,
"History": 1,
"Horror": 1,
"Mystery": 1,
"Thriller": 2,
"War": 1
}
Alternatively, use a reduce
based approach and simply increase a counter:
reduce .[].genres[].tag as $genre ({}; .[$genre] += 1)
This is likely more efficient than building an array and grouping.
To select only non-null tags, filter through values
:
map(.genres[].tag | values)
| group_by(.)
| map({ key:first, value:length })
| from_entries
or
reduce (.[].genres[].tag|values) as $genre ({}; .[$genre] += 1)
values
is internally defined as select(. != null)
.