How to count the number of keys in front of a nested scalar in a JSON document with jq


I have deeply nested JSON documents, and I want to use jq to count how many keys lie in front of each scalar. Here is a very simplified example:

{
    "one": {
        "two": {
            "three": [
                {
                    "four": [
                        {
                            "five": [
                                {
                                    "six": [
                                        {
                                            "sevenA": {
                                                "eightA": [
                                                    {
                                                        "nineA": "Blub"
                                                    },
                                                    {
                                                        "nineB": "def"
                                                    },
                                                    {
                                                        "nineC": "foo"
                                                    },
                                                    {
                                                        "nineD": 22
                                                    }
                                                ]
                                            }
                                        },
                                        {
                                            "sevenB": {
                                                "eightB": [
                                                    {
                                                        "nineE": "Bla"
                                                    },
                                                    {
                                                        "nineF": "int"
                                                    },
                                                    {
                                                        "nineG": "s"
                                                    },
                                                    {
                                                        "nineH": 60
                                                    }
                                                ]
                                            }
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    }
}

With the following command I can count the keys and array indices in front of each scalar:

jq '[paths(scalars)] | map(length)'
[
  14,
  14,
  14,
  14,
  14,
  14,
  14,
  14
] 

These counts correspond to paths like this:

  "one.two.three.[].four.[].five.[].six.[].sevenA.eightA.[].nineA",
  "one.two.three.[].four.[].five.[].six.[].sevenA.eightA.[].nineB",
  "one.two.three.[].four.[].five.[].six.[].sevenA.eightA.[].nineC",
  "one.two.three.[].four.[].five.[].six.[].sevenA.eightA.[].nineD",
  "one.two.three.[].four.[].five.[].six.[].sevenB.eightB.[].nineE",
  "one.two.three.[].four.[].five.[].six.[].sevenB.eightB.[].nineF",
  "one.two.three.[].four.[].five.[].six.[].sevenB.eightB.[].nineG",
  "one.two.three.[].four.[].five.[].six.[].sevenB.eightB.[].nineH"

So I get 14 as the result for each scalar in my example above.

But I would like to count only the keys, like this:

".one.two.three[].four[].five[].six[].sevenA.eightA[].nineA"
".one.two.three[].four[].five[].six[].sevenA.eightA[].nineB"
".one.two.three[].four[].five[].six[].sevenA.eightA[].nineC"
".one.two.three[].four[].five[].six[].sevenA.eightA[].nineD"
".one.two.three[].four[].five[].six[].sevenB.eightB[].nineE"
".one.two.three[].four[].five[].six[].sevenB.eightB[].nineF"
".one.two.three[].four[].five[].six[].sevenB.eightB[].nineG"
".one.two.three[].four[].five[].six[].sevenB.eightB[].nineH"

so that I count 9 keys in front of the scalar, or 8 keys in front of the key-value pair that holds the scalar.

Could somebody help me do this with jq?


Solution

  • To produce a stream of key-depths, you could write:

    jq 'paths(scalars) | map(strings) | length' input.json
    

    Or, with an eye to potential efficiency gains, you could use the generic stream-oriented count/1, which counts the string elements of each path without building an intermediate array:

    def count(stream):
      reduce stream as $_ (0; .+1);
    
    paths(scalars) | count(.[] | strings)
    

    Or to compute the maximum key depth using a generic stream-oriented "max" function:

    def count(stream):
      reduce stream as $_ (0; .+1);
    
    # max(empty) is empty
    def max(stream):
      reduce (stream | [.]) as $x (null;
        if . == null or $x > . then $x else . end )
      | if . == null then empty else .[0] end;
    
    max(paths(scalars) | count(.[] | strings))
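
As a rough illustration of why this works, the sketch below runs the filters on a small made-up sample (the `json` variable is hypothetical, not the data from the question). It shows that `paths(scalars)` reports array indices as numbers, that `map(strings)` drops them, and that subtracting 1 gives the number of keys in front of the final key-value pair. That last variant assumes every scalar is the value of an object key, as in the question's data; a scalar sitting directly in an array would be off by one.

```shell
# Hypothetical reduced sample with the same shape as the question's data:
# objects nested inside arrays.
json='{"one":{"two":[{"three":"x"},{"four":{"five":7}}]}}'

# Full paths: array indices appear as numbers next to the string keys.
all_paths=$(printf '%s' "$json" | jq -c 'paths(scalars)')

# Keys only: keep the string elements of each path and count them.
key_depths=$(printf '%s' "$json" | jq -c '[paths(scalars) | map(strings) | length]')

# Keys in front of the final key-value pair: do not count the scalar's own key.
parent_depths=$(printf '%s' "$json" | jq -c '[paths(scalars) | (map(strings) | length) - 1]')

printf '%s\n' "$all_paths" "$key_depths" "$parent_depths"
```

On this sample the paths are `["one","two",0,"three"]` and `["one","two",1,"four","five"]`, so the key depths come out as `[3,4]` and the parent depths as `[2,3]`.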