Search code examples
jsonjmespath

Preserve hierarchy in JMESPath query


When querying some document, I'd like to select items down in the hierarchy, but otherwise preserve the parent names in the structure. For example when querying aws database members, I can use:

DBClusters[].DBClusterMembers[].[DBInstanceIdentifier,IsClusterWriter]

To get from:

{
    "DBClusters": [
        {
            "DatabaseName": "bazbar",
            "DBClusterMembers": [
                {
                    "IsClusterWriter": false,
                    "DBClusterParameterGroupStatus": "in-sync",
                    "PromotionTier": 2,
                    "DBInstanceIdentifier": "foobar"
                },
    ...

To:

[
    [
        "foobar",
        false
    ],
    ...

But how can I get the following response instead:

{
    "bazbar": [
        [
             "foobar",
             false
        ],
        ...

I.e. for every cluster, extract the the DatabaseName as the key name for the list of members?


Solution

  • Context

    • Jmespath query
    • How to generate a query that preserves nesting of objects
    • How to generate a dynamic object key that comes from the value of another object key-value pair

    Solution

    Example

    Assuming the following sample dataset ...

        {
            "dbclusters": [
                {
                    "databasename": "alpha",
                    "dbclustermembers": [
                        {
                            "isclusterwriter": false,
                            "dbinstanceidentifier": "foobar"
                        }
                        ,{
                            "isclusterwriter": true,
                            "dbinstanceidentifier": "doobar"
                        }
                    ]
                }
                ,{
                    "databasename": "bravo",
                    "dbclustermembers": [
                        {
                            "isclusterwriter": false,
                            "dbinstanceidentifier": "foobar"
                        }
                    ]
                }
                ,{
                    "databasename": "charlie",
                    "dbclustermembers": [
                        {
                            "isclusterwriter": false,
                            "dbinstanceidentifier": "foobar"
                        }
                    ]
                }
            ]
        }
    

    ... the following jmespath query ...

        @.dbclusters[].{"databasename":databasename
            ,"dbclustermembers":[dbclustermembers[*].isclusterwriter
            ,dbclustermembers[*].dbinstanceidentifier][]
            }
    

    ... produces the following result ...

        [
          {
            "databasename": "alpha",
            "dbclustermembers": [
              false,
              true,
              "foobar",
              "doobar"
            ]
          },
          {
            "databasename": "bravo",
            "dbclustermembers": [
              false,
              "foobar"
            ]
          },
          {
            "databasename": "charlie",
            "dbclustermembers": [
              false,
              "foobar"
            ]
          }
        ]
    

    Pitfalls

    • this approach does not produce the database name as the object key as requested in the OP