Counting instances of a value using jq


I'm trying to use jq to parse the output of some tool. Specifically, I'm looking for a count of each HTTP status code in the output JSON, i.e. something like

jq -c '. | <filter>' test.json
{200: 1, 301: 1, 403: 1}

from the following input json

#test.json
[
    {
        "content-length": 45,
        "path": "/foo",
        "redirect": null,
        "status": 200
    },
    {
        "content-length": 40,
        "path": "/bar",
        "redirect": null,
        "status": 301
    },
    {
        "content-length": 1150,
        "path": "/baz",
        "redirect": null,
        "status": 403
    }
]

I could just loop through in bash with something like

$ for i in 200 301 403; do echo -n $i "    "; jq '[.[] | select(.status | tostring =="'$i'")] | length' test.json ; done
200     1
301     1
403     1

but that seems inefficient, since it runs jq once per status code. Looping inside jq feels like the better way to go, but the actual syntax to do it is a bit beyond me. I haven't had any luck finding examples or interpreting the man page. This is what I tried:

$ jq '[200, 301, 403] as $s | {$s: [.[] | select(.status == $s)] | length}' test.json 
jq: error: syntax error, unexpected ':', expecting '}' (Unix shell quoting issues?) at <top-level>, line 1:
[200, 301, 403] as $s | {$s: [.[] | select(.status == $s)] | length}                           
jq: error: May need parentheses around object key expression at <top-level>, line 1:
[200, 301, 403] as $s | {$s: [.[] | select(.status == $s)] | length}                         
jq: 2 compile errors

The python equivalent to what I want to do, in case that's clearer, is

import json
from collections import Counter
dat = json.load(open("test.json"))
print(Counter(d["status"] for d in dat))
# Counter({200: 1, 301: 1, 403: 1})

Solution

  • This works:

    reduce .[] as $row ({}; .[$row.status | tostring] += 1)
    

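    The trick is realizing that you never really wanted select in the first place; it is a relic of the "bash is looping and deciding which status I'm looking for" way of working. Instead, reduce starts from an empty object and adds 1 to the matching key for each row. The tostring is needed because jq object keys must be strings, not numbers, so the result has keys "200", "301" and "403".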
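
Since the question already frames the goal in Python, it may help to read the reduce as a plain fold. The following is only a rough sketch of the same idea, not a description of how jq is implemented: the names `rows`, `step` and `counts` are made up for illustration, and the sample data from test.json is inlined so the snippet runs on its own.

```python
import json
from functools import reduce

# Same sample data as test.json in the question, inlined for a self-contained sketch.
rows = json.loads("""
[
    {"content-length": 45, "path": "/foo", "redirect": null, "status": 200},
    {"content-length": 40, "path": "/bar", "redirect": null, "status": 301},
    {"content-length": 1150, "path": "/baz", "redirect": null, "status": 403}
]
""")


def step(counts, row):
    # Mirrors `.[$row.status | tostring] += 1`: the key is the status as a
    # string, and a key that is not present yet is treated as 0
    # (in jq, null + 1 evaluates to 1).
    key = str(row["status"])
    return {**counts, key: counts.get(key, 0) + 1}


# Mirrors `reduce .[] as $row ({}; ...)`: start from an empty object and
# fold every row into it, one at a time.
counts = reduce(step, rows, {})
print(json.dumps(counts))
```

As with the jq filter, the keys come out as strings rather than numbers, which is the one visible difference from the `Counter` output in the question.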