Given input JSON like this (there's a lot more to it really, but I've stripped the fields that aren't of any interest):
{
"modules": {
"data": [
{
"id": "aod_play_area",
"data": [
{
"titles": {
"primary": "Primary",
"secondary": "Secondary"
}
}
]
},
{
"id": "aod_tracks",
"data": [
{
"titles": {
"primary": "First Artist name here",
"secondary": "First Track title here"
},
"uris": [
{
"id": "commercial-music-service-spotify",
"uri": "https://open.spotify.com/track/1234567890"
},
{
"id": "commercial-music-service-apple",
"uri": "https://music.apple.com/gb/album/xyz/1234?i=9876"
}
]
},
{
"titles": {
"primary": "Second Artist name here",
"secondary": "Second Track title here"
},
"uris": [
{
"id": "commercial-music-service-spotify",
"label": "Spotify",
"uri": "https://open.spotify.com/track/555555555555"
},
{
"id": "commercial-music-service-apple",
"label": "Apple Music",
"uri": "https://music.apple.com/gb/album/abc/5555?i=5555"
}
]
}
]
}
]
}
}
... and desired output which has two top-level properties, each populated from a different element of the modules.data[] array, selected by its .id:
{
"title": "Primary - Secondary",
"tracks": [
{
"title": "First Track title",
"artist": "First Artist name",
"start": 3645,
"end": 3820,
"apple": "https://music.apple.com/gb/album/xyz/1234?i=9876",
"spotify": "https://open.spotify.com/track/1234567890"
},
{
"title": "Second Track title",
"artist": "Second Artist name",
"start": 3645,
"end": 3820,
"apple": "https://music.apple.com/gb/album/abc/5555?i=5555",
"spotify": "https://open.spotify.com/track/555555555555"
}
]
}
... what should my jq query look like to pull data from those two objects within modules.data? I can write queries to do one or the other, but not both, presumably because my first query has caused jq to walk down one branch of the structure and I don't know how to make it "unwind" so that the second query still works.
Extracting the titles:
cat sample.json | jq '.modules.data.[] | {
title: select(.id == "aod_play_area").data[0] | "\(.titles.primary) - \(.titles.secondary)",
tracks: []
}'
Produces:
{
"title": "Primary - Secondary",
"tracks": []
}
Extracting just the tracks:
cat sample.json | jq '.modules.data.[] | {
title: "title",
tracks: select(.id == "aod_tracks").data | map({
title: .titles.primary,
artist: .titles.secondary,
start: .offset.start,
end: .offset.end,
apple: .uris[] | select(.id =="commercial-music-service-apple").uri,
spotify: .uris[] | select(.id =="commercial-music-service-spotify").uri
})
}'
Produces:
{
"title": "title",
"tracks": [
{
"title": "First Artist name here",
"artist": "First Track title here",
"start": null,
"end": null,
"apple": "https://music.apple.com/gb/album/xyz/1234?i=9876",
"spotify": "https://open.spotify.com/track/1234567890"
},
{
"title": "Second Artist name here",
"artist": "Second Track title here",
"start": null,
"end": null,
"apple": "https://music.apple.com/gb/album/abc/5555?i=5555",
"spotify": "https://open.spotify.com/track/555555555555"
}
]
}
Combining the two:
cat sample.json | jq '.modules.data.[] | {
title: select(.id == "aod_play_area").data[0] | "\(.titles.primary) - \(.titles.secondary)",
tracks: select(.id == "aod_tracks").data | map({
title: .titles.primary,
artist: .titles.secondary,
start: .offset.start,
end: .offset.end,
apple: .uris[] | select(.id =="commercial-music-service-apple").uri,
spotify: .uris[] | select(.id =="commercial-music-service-spotify").uri
})
}'
... produces no output at all. I believe this is because the first select has taken us down one "branch" of the outermost data, so the second select doesn't find what it's looking for (as children of where it has ended up down that first branch). How should I rewrite my query to successfully extract all of the data of interest?
(I'm new to jq, so apologies if I've misused any terminology.)
You need to work on .modules.data instead of .modules.data[]:
jq '.modules.data |
{ title: (.[] | select(.id == "aod_play_area").data[0] | "\(.titles.primary) - \(.titles.secondary)"),
tracks: (.[] | select(.id == "aod_tracks").data | map({
title: .titles.secondary,
artist: .titles.primary,
start: .offset.start,
end: .offset.end,
apple: .uris[] | select(.id =="commercial-music-service-apple").uri,
spotify: .uris[] | select(.id =="commercial-music-service-spotify").uri
}))
}' sample.json
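To try the approach end to end, here is a minimal sketch that writes a trimmed-down input to a file and runs a query of the same shape against it. The file name sample-min.json, the single track with only a Spotify link, and the -c (compact output) flag are choices made for this illustration, not part of the original data. The sketch maps titles.secondary to title and titles.primary to artist, which is what the desired output in the question shows.

```shell
# Hypothetical trimmed-down input: one title module and one track module.
cat > sample-min.json <<'EOF'
{"modules": {"data": [
  {"id": "aod_play_area",
   "data": [{"titles": {"primary": "Primary", "secondary": "Secondary"}}]},
  {"id": "aod_tracks",
   "data": [{"titles": {"primary": "Artist", "secondary": "Track"},
             "uris": [{"id": "commercial-music-service-spotify",
                       "uri": "https://open.spotify.com/track/1234567890"}]}]}
]}}
EOF

# Start from the array itself and iterate it separately inside each field,
# so both fields can see every module.
result=$(jq -c '.modules.data | {
  title:  (.[] | select(.id == "aod_play_area").data[0]
               | "\(.titles.primary) - \(.titles.secondary)"),
  tracks: (.[] | select(.id == "aod_tracks").data | map({
             title:   .titles.secondary,
             artist:  .titles.primary,
             spotify: (.uris[] | select(.id == "commercial-music-service-spotify").uri)
           }))
}' sample-min.json)
echo "$result"
```

With this input the command should print a single object: {"title":"Primary - Secondary","tracks":[{"title":"Track","artist":"Artist","spotify":"https://open.spotify.com/track/1234567890"}]}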
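When you work on .modules.data[], your filter receives .modules.data[0] as input, then .modules.data[1], so it tries to construct two objects, each missing one piece of information:
{ title: ..., tracks: empty }
{ title: empty, tracks: ... }
Each of them has a field whose expression yields empty, and jq emits an object only for each combination of results from its field expressions, so neither object is produced and the overall result is empty.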
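The effect of empty inside an object constructor can be seen in isolation. The following is a small sketch using jq's -n (no input) and -c (compact output) options; the field names a and b and the shell variable names are made up for the illustration.

```shell
# A field whose value expression yields no results suppresses the whole object.
none=$(jq -c -n '{a: 1, b: empty}')
echo "with an empty field: [$none]"

# A field that yields several results produces one object per result.
both=$(jq -c -n '{a: (1, 2), b: "x"}')
echo "$both"
```

The first command prints nothing between the brackets, while the second prints two objects, one per value of a. This is the same mechanism that makes the combined query in the question produce no output: for each module, one of the two select calls yields nothing.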