Search code examples
jsonjqtabulate

Relate elements in table form from Json file with jq


I'm new to jq and I have the following code to obtain tabulated the values for each element called Abc:

["Abc"], ( .. | objects | select(has("Abc")) | [.["Abc"]] ) | @tsv

This is the current output I get:

"Abc"
"4"
"2"
"1"
"9"
"3"
"2"
"4"
"9"

I would like to add 4 columns to the left to show for each Abc value the corresponding page, row and column. Additionally if possible as first column add a counter from 1 to number of "Abc" elements.

Below I show the current output, compared with the desired output and the structure of the Json file in order to clarify: enter image description here

The input Json file is below:

{
  "document": {
    "page": [
      {
        "@index": "0",
        "image": {
          "Abc": "4"
        }
      },
      {
        "@index": "1",
        "row": [
          {
            "column": [
              {
                "text": {
                  "Abc": "2"
                }
              }
            ]
          },
          {
            "column": [
              {
                "text": {
                  "Abc": "1"
                }
              },
              {
                "text": {
                  "Abc": "9"
                }
              }
            ]
          },
          {
            "column": [
              {
                "text": {
                  "Abc": "3"
                }
              }
            ]
          }
        ]
      },
      {
        "@index": "2",
        "row": [
          {
            "column": [
              {
                "text": {
                  "Abc": "2"
                }
              }
            ]
          },
          {
            "column": [
              {
                "text": {
                  "Abc": "4"
                }
              },
              {
                "text": {
                  "Abc": "9"
                }
              }
            ]
          }
        ]
      }
    ]
  }
}

I hope someone could help me. Thanks in advance.


Solution

  • The following solution uses paths and has several advantages, including brevity, simplicity, and that it can easily be adapted to a handle data that is in a different format.

    For clarity, we begin by defining a function that adds the row numbers:

    # add a sequential id, starting at 1
    def tsvRows(s):
      foreach s as $s (0; .+1; [.] + $s)
      | @tsv;
    
    (["counter", "page", "row", "column", "Abc"] | @tsv),
    tsvRows(paths as $p
      | select($p[-1] == "Abc")
      | getpath($p) as $v
      | $p
      | .[2] as $page
      | (if .[3] == "row" then .[4] else null end) as $row
      | (if .[5] == "column" then .[6] else null end) as $column
      | [$page, $row, $column, $v] )