Search code examples
sqlarraysjsonpostgresqlrecursive-query

Extract JSON values recursivey in postgres


I have some JSON data stored in a column. I want to parse the json data and extract all the values against a particular key.

Here's my sample data:

    {
  "fragments": [
    {
      "fragments": [
        {
          "fragments": [
            {
              "fragments": [],
              "fragmentName": "D"
            },
            {
              "fragments": [],
              "fragmentName": "E"
            },
            {
              "fragments": [],
              "fragmentName": "F"
            }
          ],
          "fragmentName": "C"
        }
      ],
      "fragmentName": "B"
    }
  ],
  "fragmentName": "A"
}

Expected output:

D, E, F, C, B, A

I want to extract all fragmentName values from the above JSON.

I have gone through the below stacks, but haven't found anything useful: Collect Recursive JSON Keys In Postgres Postgres recursive query with row_to_json

Edited:

Here's one approach I have tried on the above stacks:

WITH RECURSIVE key_and_value_recursive(key, value) AS (
  SELECT
    t.key,
    t.value
  FROM temp_frg_mapping, json_each(temp_frg_mapping.info::json) AS t
  WHERE id=2

  UNION ALL

  SELECT
    t.key,
    t.value
  FROM key_and_value_recursive,
    json_each(CASE 
      WHEN json_typeof(key_and_value_recursive.value) <> 'object' THEN '{}' :: JSON
      ELSE key_and_value_recursive.value
    END) AS t
    
)
SELECT *
FROM key_and_value_recursive;

Output: SQL query output. Row 1: columns "key", "value". Row 2: "fragment", (complex nested object). Row 3: "fragmentName", "A"

Getting only 0 level nesting.


Solution

  • I would use a recursive query, but with jsonb_array_elements():

    with recursive cte as (
        select id, info ->> 'fragmentName' as val, info -> 'fragments' as info, 1 lvl 
        from mytable 
        where id = 2
        union all
        select c.id, x.info ->> 'fragmentName', x.info -> 'fragments', c.lvl + 1
        from cte c
        cross join lateral jsonb_array_elements(c.info) as x(info)
        where c.info is not null
    )
    select id, val, lvl
    from cte
    where val is not null
    

    The query traverses the object depth-first; at each step of the way, we unnest the json array and check if a fragment name is available. We don't need to check the types of the returned values: we just use the standard functions, until the data exhausts.

    Demo on DB Fiddle

    Sample data:

    {
        "fragments": [
            {
                "fragments": [
                    {
                        "fragments": [
                            {
                                "fragments": [
                                ],
                                "fragmentName": "D"
                            },
                            {
                                "fragments": [
                                ],
                                "fragmentName": "E"
                            },
                            {
                                "fragments": [
                                ],
                                "fragmentName": "F"
                            }
                        ],
                        "fragmentName": "C"
                    }
                ],
                "fragmentName": "B"
            }
        ],
        "fragmentName": "A"
    }
    

    Results:

    id | val | lvl
    -: | :-- | --:
     2 | A   |   1
     2 | B   |   2
     2 | C   |   3
     2 | D   |   4
     2 | E   |   4
     2 | F   |   4