Search code examples
jsoncsvexport-to-csvjq

Export json via jq to csv


I have this output as test.json ( Its an AWS extract, but I have changed the names )

[
    {
        "InstanceId": "I-1234",
        "Vol": "vol-5678",
        "Delete": false,
        "State": "in-use",
        "Tags": [
            {
                "Key": "Size",
                "Value": "large"
            },
            {
                "Key": "Colour",
                "Value": "red"
            },
            {
                "Key": "Shape",
                "Value": "square"
            },
            {
                "Key": "Weight",
                "Value": "light"
            }
        ]
    }
]

I want to export specific fields, including all tags to a csv, so it looks like this:

id,vol,state,size,colour,shape,weight
value,value,value,value,value,value,value

I have run this:

cat test.json | jq -c ' { id: .[].InstanceId, vol: .[].Vol, tags: .[].Tags | map ( [ .Key, .Value] | join (":")) | @csv } ' >> test.csv

And it looks like this:

cat test.csv
{"id":"I-1234","vol":"vol-5678","tags":"\"Size:large\",\"Colour:red\",\"Shape:square\",\"Weight:25kg\""}

if I open in Excel, looks like:

{"id":"I-1234"  vol:"vol-5678"  tags:"\"Size:large\"    \"Colour:red\"  \"Shape:square\"    \"Weight:25kg\""}

I will be looping this over many aws resources, and would like to keep appending to csv.

I want to remove 
{ } at beginning and end.

the key description I would like at top as a header, rather than to the left of the value..

    so for: "id":"I-1234"   vol:"vol-5678"
    I would like
    id, vol
    I-1234, vol-5678

and the same with the Tags
remove the Array Name: "tags:" ( think its the array name, I'm not a developer, infrastructure dude! ) and just leave
Size,Colour,Shape,Weight, ...
large,red,square,25kg, ...

Can anyone help, point me in the right direction ..
thanks .. :)

Solution

  • jq -r '
      ["Size","Colour","Shape","Weight"] as $Keys
      | (["id", "vol"] + ($Keys|map(ascii_downcase))),
        ( .[]
         | (.Tags|from_entries) as $dict
         | [.InstanceId, .Vol, $dict[$Keys[]]] )
      | @csv
    '
    

    This will produce valid CSV, with the columns in the desired order, irrespective of the ordering of the items in the .Tags array.

    If you don't want the strings in the rows to be quoted, then (at the risk of not having valid CSV) one option to consider would be replacing @csv above by join(","). Alternatively, you might wish to consider using @tsv and then replacing the tabs by commas (e.g. using sed or tror even jq :-).