Search code examples
jsonfilterjq

Filter jq for reshaping a JSON array into a nested representation of parent/child elements relationships


I have a JSON source like this, in which the root array can vary both in number of elements and in parent relationships, inferred from the values in “id” and “parent” keys (while the “type” key specifies that the parents are all of type == 2).

[
  { "id": 7, "parent": 3, "type": 1, "title": "Item 1" },
  { "id": 8, "parent": 3, "type": 2, "title": "Folder 1" },
  { "id": 9, "parent": 8, "type": 2, "title": "Folder 2" },
  { "id": 10, "parent": 9, "type": 1, "title": "Item 3" },
  { "id": 11, "parent": 9, "type": 2, "title": "Folder 3" },
  { "id": 12, "parent": 11, "type": 1, "title": "Item 4" },
  { "id": 13, "parent": 11, "type": 1, "title": "Item 5" },
  { "id": 15, "parent": 8, "type": 1, "title": "Item 2" },
  { "id": 16, "parent": 3, "type": 1, "title": "Item 6" }
]

I need a filter to be passed to jq (v.1.7.1 currently) that allows me to get a tree reshape for a nested parent/child representation of elements, like this:

[
  {
    "title": "Item 1",
    "id": 7,
    "type": 1
  },
  {
    "title": "Folder 1",
    "id": 8,
    "type": 2,
    "children": [
      {
        "title": "Folder 2",
        "id": 9,
        "type": 2,
        "children": [
          {
            "title": "Item 3",
            "id": 13,
            "type": 1
          },
          {
            "title": "Folder 3",
            "id": 14,
            "type": 2,
            "children": [
              {
                "title": "Item 4",
                "id": 15,
                "type": 1
              },
              {
                "title": "Item 5",
                "id": 16,
                "type": 1
              }
            ]
          }
        ]
      },
      {
        "title": "Item 2",
        "id": 11,
        "type": 1
      }
    ]
  },
  {
    "title": "Item 6",
    "id": 12,
    "type": 1
  }
]

Please don't mind that the id values in my two examples don't match: they are code snippets taken at different times. I will finally be able to resolve these inconsistencies precisely when I have the required filter :-)

I ask this as a novice jq learner...

Update - Work in progress

Before opening this help request I had obviously searched but to no avail for an already published jq filter that would be suitable for my case. Today I got the idea to look for a similar scenario developed in Javascript and I spotted this one
I then tried my hand at translating to jq directives the same approach and the result is exceeding my own success expectations :-)
Here's what I put together:

reduce .[] as $o ( [{}];
  map(
    if $o.parent == $o.id then
      .parents[$o.id] = ($o + {children: []} + (.parents[$o.id] // {}))
    else
      .parents[$o.id] = $o
    end
    |
    if .parents[$o.parent] then
      .parents[$o.parent].children += [.parents[$o.id]] | del(.parents[$o.id])
    end
  )
) | .[0].parents | map(select( . != null))

And this is the output I'm getting from it

[
  {
    "id": 7,
    "parent": 3,
    "type": 1,
    "title": "Item 1"
  },
  {
    "id": 8,
    "parent": 3,
    "type": 2,
    "title": "Folder 1",
    "children": [
      {
        "id": 9,
        "parent": 8,
        "type": 2,
        "title": "Folder 2"
      },
      {
        "id": 15,
        "parent": 8,
        "type": 1,
        "title": "Item 2"
      }
    ]
  },
  {
    "id": 10,
    "parent": 9,
    "type": 1,
    "title": "Item 3"
  },
  {
    "id": 11,
    "parent": 9,
    "type": 2,
    "title": "Folder 3",
    "children": [
      {
        "id": 12,
        "parent": 11,
        "type": 1,
        "title": "Item 4"
      },
      {
        "id": 13,
        "parent": 11,
        "type": 1,
        "title": "Item 5"
      }
    ]
  },
  {
    "id": 16,
    "parent": 3,
    "type": 1,
    "title": "Item 6"
  }
]

In fact, all elements are in the expected place but I am sure my filter it is largely optimizable. In the last line I resolved to those ugly .[0].parents | map(select( . != null)) to clean up from unwanted null elements. In addition to this, I also have to make the "parent" keys disappear from each element.
There will certainly be ways to do the same already in the construction phase.
I will attempt to learn how to improve the whole thing…


Solution

  • Try this :

    jq 'map(.id |= tostring | .parent |= tostring) # Convert .id/.parent to string
      | INDEX(.[]; .id) as $r # $r means remaining to be processed.
      | map(.parent as $a | select(any($r[];.id == $a)|not)) as $R # $R means result
      | ($r | del(.[$R | .. | objects | .id])) # Remove processed from $r
      | {R:$R,r:.} # Temporary object
      | until((.r | length) == 0;
            reduce (.R as $R | .r | map(. as $o|select(any($R | .. | objects | $o.parent == .id))))[] as $o ({R, r}; # Find next candidates
              .r |= del(.[$o.id]) # Remove from reminaing
            | .R |= walk(if type == "object" and .id == $o.parent
                         then .children+=[$o]
                         end
                        ) # Add to the results
            )
        ).R
      | walk(if type == "object" then .id |= tonumber | .parent |= tonumber end)
    ' input.json