I'm trying to extract the value of a JSON object using jq --stream, because the real data can be multiple gigabytes in size.
This is the JSON I'm using for my tests, where I want to extract the value of item:
{
  "other": "content here",
  "item": {
    "A": {
      "B": "C"
    }
  },
  "test": "test"
}
The jq options I'm using:
jq --stream --null-input 'fromstream(inputs | select(.[0][0] == "item"))[]' example.json
However, I don't get any output with this command.
A strange thing I found is that when I remove the entry that follows item (here, test), the above command seems to work:
{
  "other": "content here",
  "item": {
    "A": {
      "B": "C"
    }
  }
}
The result looks as expected:
❯ jq --stream --null-input 'fromstream(inputs | select(.[0][0] == "item"))[]' example.json
{
  "A": {
    "B": "C"
  }
}
But as I cannot control the input JSON, this is not a solution.
I'm using jq version 1.6 on macOS.
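You didn't truncate the stream, so after filtering it to only include the parts below .item, fromstream is missing the final back-tracking item [["item"]]. fromstream only emits a value once it sees a closing item whose path has length 1, and your filter never lets one through. (In your second example, item is the last key, so the closing item of the whole document happens to be [["item"]], which passes your filter.) Either add it manually at the end (not recommended, as the result would also include the top-level object), or, much simpler, use 1 | truncate_stream to strip the first level altogether: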
jq --stream --null-input '
  fromstream(1 | truncate_stream(inputs | select(.[0][0] == "item")))
' example.json
{
  "A": {
    "B": "C"
  }
}
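To see why the original filter printed nothing, it can help to look at the items themselves. The following is only a sketch: it assumes the question's first test document is saved as example.json (recreated here with a heredoc) and adds --compact-output purely for readability.

```shell
# Recreate the question's first test document (file name as used in the question).
cat > example.json <<'EOF'
{
  "other": "content here",
  "item": {
    "A": {
      "B": "C"
    }
  },
  "test": "test"
}
EOF

# Items that survive the select(): none of the closing items has a path of length 1.
filtered=$(jq --stream --null-input --compact-output \
  'inputs | select(.[0][0] == "item")' example.json)
printf '%s\n' "$filtered"
# [["item","A","B"],"C"]
# [["item","A","B"]]
# [["item","A"]]

# The same items after 1 | truncate_stream: the last path now has length 1.
truncated=$(jq --stream --null-input --compact-output \
  '1 | truncate_stream(inputs | select(.[0][0] == "item"))' example.json)
printf '%s\n' "$truncated"
# [["A","B"],"C"]
# [["A","B"]]
# [["A"]]
```

The last line of the second listing, [["A"]], is the closing item that lets fromstream emit the reassembled object.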
Alternatively, you can use reduce and setpath to build up the result object yourself:
jq --stream --null-input '
  reduce inputs as $in (null;
    if $in | .[0][0] == "item" and has(1) then setpath($in[0];$in[1]) else . end
  )
' example.json
{
  "item": {
    "A": {
      "B": "C"
    }
  }
}
To remove the top-level object, either filter for .item at the end, or, similarly to truncate_stream, remove the path's first item using [1:] to strip the first level:
jq --stream --null-input '
  reduce inputs as $in (null;
    if $in | .[0][0] == "item" and has(1) then setpath($in[0][1:];$in[1]) else . end
  )
' example.json
{
  "A": {
    "B": "C"
  }
}
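For completeness, the other option mentioned above (filtering for .item at the end) could look roughly like this. It is a sketch under the same assumption that the question's first test document is saved as example.json; --compact-output is only there to keep the output on one line.

```shell
# Recreate the question's first test document (file name as used in the question).
cat > example.json <<'EOF'
{
  "other": "content here",
  "item": {
    "A": {
      "B": "C"
    }
  },
  "test": "test"
}
EOF

# Build {"item": ...} with reduce/setpath as before, then select .item at the end.
result=$(jq --stream --null-input --compact-output '
  reduce inputs as $in (null;
    if $in | .[0][0] == "item" and has(1) then setpath($in[0]; $in[1]) else . end
  ) | .item
' example.json)
printf '%s\n' "$result"
# {"A":{"B":"C"}}
```

Both variants give the same result here; the [1:] version simply never creates the outer object in the first place.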