Search code examples
jsonnestedexport-to-csvjq

Convert json to csv using jq, with array nested within array


How can I flatten this json array and write out to a csv file using jq so that each row contains all the fields i.e.

I'm a bit of a novice with jq but started with this, which does not capture the inner nested array

jq -r '(.[0] | keys_unsorted) as $keys | $keys, map([.[ $keys[] ]])[]'

[
   {
      "name":"PM2.5",
      "unit":"ug/m3",
      "values":[
         {
            "timestamp":"2020-11-16T13:10:21.135Z",
            "value":3.696
         },
         {
            "timestamp":"2020-11-16T13:10:31.649Z",
            "value":3.696
         },
         {
            "timestamp":"2020-11-16T13:10:42.35Z",
            "value":3.696
         },
         {
            "timestamp":"2020-11-16T13:10:52.953Z",
            "value":3.696
         }
      ]
   },
   {
      "name":"PM10",
      "unit":"ug/m3",
      "values":[
         {
            "timestamp":"2020-11-16T13:10:21.135Z",
            "value":7.596
         },
         {
            "timestamp":"2020-11-16T13:10:31.649Z",
            "value":7.596
         },
         {
            "timestamp":"2020-11-16T13:10:42.35Z",
            "value":7.596
         },
         {
            "timestamp":"2020-11-16T13:10:52.953Z",
            "value":7.596
         }
      ]
   }
]

To a csv as follows:

"name", "unit", "timestamp", "value"
"PM2.5", "ug/m3", "2020-11-16T13:10:21.135Z", 3.696
"PM2.5", "ug/m3", "2020-11-16T13:10:31.649Z", 3.696
"PM2.5", "ug/m3", "2020-11-16T13:10:42.35Z", 3.696
"PM2.5", "ug/m3", "2020-11-16T13:10:52.953Z", 3.696  
"PM10", "ug/m3", "2020-11-16T13:10:21.135Z", 7.596
"PM10", "ug/m3", "2020-11-16T13:10:31.649Z", 7.596
"PM10", "ug/m3", "2020-11-16T13:10:42.35Z", 7.596
"PM10", "ug/m3", "2020-11-16T13:10:52.953Z", 7.596

Solution

  • $ jq -r '
      [["name","unit","timestamp","value"]], 
       [ .[] 
         | [ .name, .unit ] + 
           (.values[] | [.timestamp,.value]) ] 
      | .[] 
      | @csv' input.json 
    "name","unit","timestamp","value"
    "PM2.5","ug/m3","2020-11-16T13:10:21.135Z",3.696
    "PM2.5","ug/m3","2020-11-16T13:10:31.649Z",3.696
    "PM2.5","ug/m3","2020-11-16T13:10:42.35Z",3.696
    "PM2.5","ug/m3","2020-11-16T13:10:52.953Z",3.696
    "PM10","ug/m3","2020-11-16T13:10:21.135Z",7.596
    "PM10","ug/m3","2020-11-16T13:10:31.649Z",7.596
    "PM10","ug/m3","2020-11-16T13:10:42.35Z",7.596
    "PM10","ug/m3","2020-11-16T13:10:52.953Z",7.596
    

    Builds an array of arrays, one per row including a header line, and then formats them all as CSV records.