Flatten nested JSON data in Python


I've been stuck on this problem for a while and can't find a solution that does what I want.

Example: I have JSON like this:

{
    "SECTION": {
        "ID": 1,
        "COMMENT" : "foo bar ",
        "STRUCTURE" : {
            "LIEN" : [
                {
                    "from": "2020-01-01",
                    "to": "2020-01-03"
                },
                {
                    "from": "2020-01-04",
                    "to": "2999-01-07"
                }
            ]
        },
        "CONTEXTE":{
            "NATURE": {
                "text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"
            }
        }

    }
}

I would like flattened output like this, with one object per element of the "LIEN" list:

{
    "SECTION.ID": 1,
    "SECTION.COMMENT": "foo bar ",
    "SECTION.STRUCTURE.LIEN.from": "2020-01-01",
    "SECTION.STRUCTURE.LIEN.to": "2020-01-03",
    "SECTION.CONTEXTE.NATURE.text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"
}

{
    "SECTION.ID": 1,
    "SECTION.COMMENT": "foo bar ",
    "SECTION.STRUCTURE.LIEN.from": "2020-01-04",
    "SECTION.STRUCTURE.LIEN.to": "2999-01-07",
    "SECTION.CONTEXTE.NATURE.text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"
}

Does anyone have any idea how I can do this in Python? Thank you so much.


Solution

  • I suggest using Python's json module to convert the JSON text to a Python object, and then flattening that object recursively. The code below uses the {**a, **b} dict-unpacking syntax, so it needs Python 3.5 or later. It could be a good starting point:

    import json

    def flatten_helper(prefix, list_of_dict):
        # Prepend "prefix." to every key of every dict in the list.
        res = []
        for d in list_of_dict:
            res.append({'.'.join([prefix, k]): v for k, v in d.items()})
        return res

    def flatten(x):
        # Return a list of flat dicts. Every list in the input multiplies
        # the number of records. List elements are expected to be dicts or lists.
        if isinstance(x, list):
            res = []
            for ele in x:
                res = res + flatten(ele)
            return res
        else:
            res = [{}]
            for k, v in x.items():
                if isinstance(v, (dict, list)):
                    # Prefix only the keys that come from v, then merge them
                    # into every record built so far. Prefixing after the merge
                    # would also rename sibling keys such as ID and COMMENT.
                    tempo = flatten_helper(k, flatten(v))
                    res = [{**r, **t} for r in res for t in tempo]
                else:
                    # Scalar value: copy it into every record.
                    for r in res:
                        r[k] = v
            return res

    jsonobj = '{"SECTION": {"ID": 1, "COMMENT" : "foo bar ", "STRUCTURE" : { "LIEN" : [{"from": "2020-01-01", "to": "2020-01-03"}, {"from": "2020-01-04", "to": "2999-01-07" }]}, "CONTEXTE":{"NATURE": {"text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"}}}}'

    pyobj = json.loads(jsonobj)
    res = flatten(pyobj)  # a list with one flat dict per "LIEN" element
    print(json.dumps(res, indent=4))
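
    As a possible alternative, here is a minimal sketch of the same idea that carries the dotted prefix down the recursion and combines the per-key results with itertools.product. The name flatten_records and the sample data variable are my own, not from the question. It assumes Python 3.6 or later (f-strings), and an empty list in the input yields no records at all, so treat it as a starting point rather than a finished solution.

```python
import json
from itertools import product


def flatten_records(node, prefix=""):
    """Return a list of flat dicts, one per combination of list elements."""
    if isinstance(node, dict):
        # Flatten each value under its dotted key; every key yields a list
        # of alternative partial records.
        per_key = [
            flatten_records(value, f"{prefix}.{key}" if prefix else key)
            for key, value in node.items()
        ]
        # Combine one alternative from every key (Cartesian product).
        records = []
        for combo in product(*per_key):
            merged = {}
            for part in combo:
                merged.update(part)
            records.append(merged)
        return records
    if isinstance(node, list):
        # Each list element contributes its own alternative records.
        return [rec for item in node for rec in flatten_records(item, prefix)]
    # Scalar leaf: a single record holding one dotted key.
    return [{prefix: node}]


# Sample input taken from the question.
data = json.loads(
    '{"SECTION": {"ID": 1, "COMMENT": "foo bar ", '
    '"STRUCTURE": {"LIEN": [{"from": "2020-01-01", "to": "2020-01-03"}, '
    '{"from": "2020-01-04", "to": "2999-01-07"}]}, '
    '"CONTEXTE": {"NATURE": {"text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"}}}}'
)
records = flatten_records(data)
for record in records:
    print(json.dumps(record, indent=4))
```

    Unlike the recursive version that calls .items() on every list element, this sketch also accepts lists of plain values: {"a": [1, 2], "b": "x"} becomes two records, one with "a": 1 and one with "a": 2.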