Tags: json, google-bigquery, jq, reformatting, jsonlines

Make each (sub-)JSON object appear on its own line with jq


In Cloud BigQuery, the accepted JSON format is:

One JSON object, including any nested/repeated fields, must appear on each line.

Reference: https://cloud.google.com/bigquery/data-formats#json_format

Now, given the following JSON:

{
  "1": {
    "kind": "person",
    "fullName": "John Doe",
    "age": 22,
    "gender": "Male",
    "citiesLived": [
      {
        "place": "Seattle",
        "numberOfYears": 5
      },
      {
        "place": "Stockholm",
        "numberOfYears": 6
      }
    ]
  },
  "2": {
    "kind": "person",
    "fullName": "Jane Austen",
    "age": 24,
    "gender": "Female",
    "citiesLived": [
      {
        "place": "Los Angeles",
        "numberOfYears": 2
      },
      {
        "place": "Tokyo",
        "numberOfYears": 2
      }
    ]
  }
}

How can I convert it into the following with jq?

{"kind": "person", "fullName": "John Doe", "age": 22, "gender": "Male", "citiesLived": [{"place": "Seattle", "numberOfYears": 5}, {"place": "Stockholm", "numberOfYears": 6}]}
{"kind": "person", "fullName": "Jane Austen", "age": 24, "gender": "Female", "citiesLived": [{"place": "Los Angeles", "numberOfYears": 2}, {"place": "Tokyo", "numberOfYears": 2}]}

Solution

  • The key here is the "-c" (--compact-output) option, which makes jq print each output value on a single line. In effect, this produces the JSONLines output format.

    In your particular case, the solution is simply:

    jq -c '.[]'
    

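    Your shell might even allow you to drop the quotation marks :-) Some shells, such as zsh, may treat the brackets as a glob pattern, though, so keeping the quotes is the safer choice.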
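
As a minimal end-to-end sketch of the command above: the file names people.json and people.jsonl are hypothetical, and the input is a trimmed version of the question's data (fewer fields, one city per person). It assumes jq is installed and on the PATH.

```shell
# people.json and people.jsonl are hypothetical file names used for illustration.
# The data is a trimmed version of the question's input.
cat > people.json <<'EOF'
{
  "1": {
    "fullName": "John Doe",
    "age": 22,
    "citiesLived": [
      { "place": "Seattle", "numberOfYears": 5 }
    ]
  },
  "2": {
    "fullName": "Jane Austen",
    "age": 24,
    "citiesLived": [
      { "place": "Los Angeles", "numberOfYears": 2 }
    ]
  }
}
EOF

# .[] emits each value of the top-level object as a separate output;
# -c prints each of those outputs on a single line.
jq -c '.[]' people.json > people.jsonl

cat people.jsonl
```

Note that the compact output has no spaces after colons and commas, unlike the sample in the question. It is still valid JSON, with one object per line.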