Search code examples
pythonjsonpandascsvconverters

JSON to CSV iterate through nested objects and map value to key


I do have a json in the format of (but with even more/deeper nested objects)

{
    "Group": {
        "Group1": {
            "GroupA": {
                "value": "#fffff",
                "type": "color"

            },
            "GroupB": {
                "value": "#c2c2c2",
                "type": "color"

            },
            "GroupC": {
                "Group_XY": {
                    "value": "#aw13ma",
                    "type": "color"
                }

            }

        },
        "Group2":{
            "GroupA": {
                "value": "#ff123f",
                "type": "color"

            },
            "GroupB": {
                "value": "#fffff",
                "type": "color"

            },
            "GroupD": {
                "value": "#ababab",
                "type": "color"
                
            }

        },
        "Group3": {
            "GroupA": {
                "value": "#fffff",
                "type": "color"

            },
            "GroupC": {
                "Group_XY": {
                    "value": "#12a3ma",
                    "type": "color"
                }

            },
            "GroupD": {
                "value": "#fffff",
                "type": "color"
                
            }

        },
        "Group4": {
            "GroupA": {

            },
            "GroupB": { 
                "value": "#c2c2c2",
                "type": "color"

            },
            "GroupE": {
                "Group_XX": {
                    "Group_A1": {
                        "value": "#12a3ma",
                        "type": "color"
                    }
                }
                
            }

        }
    }  
}

what i want to achieve is a csv/table output to compare:

Group1 Group2 Group3 Group4
GroupA #fffff #ff123f #fffff
GroupB #c2c2c2 #fffff #c2c2c2
GroupC
Group_XY #aw13ma #12a3ma
GroupD #fffff
GroupE
Group_XX
Group_A1 #12a3ma

so i figured that python and pandas might be the correct way to kick it off.

import pandas as pd
import json
with open('colorDate.json') as f:
    data = json.load(f)
pd.json_normalize(data,max_level=0)

gets me

Group1 Group2 Group3 Group4
{'GroupA': {'value': '#fffff', 'type': 'color'... {'GroupA': {'value': '#ff123f', 'type': 'color... {'GroupA': {'value': '#fffff', 'type': 'color'... {'GroupA': {}, 'GroupB': {'value': '#c2c2c2', ...

now i need to transform the all the keys for the next level to rows?!


Solution

  • i don't care about the tool stack

    So, why not use jq, then? It can transform JSON with ease, and even generate your desired output format using the @csv builtin and the --raw-output (or -r) option.

    .Group | to_entries | reduce (
      .[] | (.value | paths(objects) as $p | [$p, getpath($p).value]) + [.key]
    ) as [$path, $value, $column] ({}; ."\($path)"[$column] = $value)
    
    | (keys | map(fromjson) | sort) as $paths
    | ($paths | map(length) | max)  as $width
    | (map(keys[]) | unique)        as $heads
    
    | [(range($width) | null), $heads[]], ($paths[] as $path
      | ($path | .[:-1] |= map(null)) + [range($path | $width - length) | null]
      + [."\($path)"[$heads[]]]
    ) | @csv
    
    ,,,"Group1","Group2","Group3","Group4"
    "GroupA",,,"#fffff","#ff123f","#fffff",
    "GroupB",,,"#c2c2c2","#fffff",,"#c2c2c2"
    "GroupC",,,,,,
    ,"Group_XY",,"#aw13ma",,"#12a3ma",
    "GroupD",,,,"#ababab","#fffff",
    "GroupE",,,,,,
    ,"Group_XX",,,,,
    ,,"Group_A1",,,,"#12a3ma"
    

    Demo


    A personal suggestion: For better readability, I would "merge" those "blank" lines as they provide no additional information, while combining (and if necessary, duplicating) the path items on the row headers. For this, extend paths(objects) to paths(objects | has("value")) to only descend to items of value, and reveal the full path by changing ($path | .[:-1] |= map(null)) to just $path:

    .Group | to_entries | reduce (
      .[] | (.value | paths(objects | has("value")) as $p | [$p, getpath($p).value]) + [.key]
    ) as [$path, $value, $column] ({}; ."\($path)"[$column] = $value)
    
    | (keys | map(fromjson) | sort) as $paths
    | ($paths | map(length) | max)  as $width
    | (map(keys[]) | unique)        as $heads
    
    | [(range($width) | null), $heads[]], ($paths[] as $path
      | $path + [range($path | $width - length) | null]
      + [."\($path)"[$heads[]]]
    ) | @csv
    
    ,,,"Group1","Group2","Group3","Group4"
    "GroupA",,,"#fffff","#ff123f","#fffff",
    "GroupB",,,"#c2c2c2","#fffff",,"#c2c2c2"
    "GroupC","Group_XY",,"#aw13ma",,"#12a3ma",
    "GroupD",,,,"#ababab","#fffff",
    "GroupE","Group_XX","Group_A1",,,,"#12a3ma"
    

    Demo