Search code examples
pythonjsonpandasnestedto-json

Nesting a group of columns with a new header in the JSON serialization


I have a Pandas dataframe of this kind

    start    end   compDepth compReleaseDepth compMeanRate
0     0.0   0.62  58.0999985              1.5          110
1    0.66   1.34  57.1399994                3           94
2    1.42    2.1  57.1399994              2.5           89
3    2.21   2.87  58.5699997              2.5           79
4    2.97   3.65  55.2399979              3.5           77
5    3.78   4.45  53.8600006              1.5           76
6    4.49   5.17  62.2700005              0.5           81
7    5.97   6.65  56.1899986              2.5           85

I need to serialise the data into JSON and I used df.to_json(orient='records') and it works fine.

However, I would like to nest the last 3 columns into a new header called "annotations". This is what I want to achieve, is there a simple way to do this?

[{
        "start": "0.0",
        "end": "0.62",
        "annotations": {
            "compDepth": "58.0999985",
            "compReleaseDepth": "1.5",
            "compMeanRate": "110"
        }
    }, {
        "start": "0.66",
        "end": "1.34",
        "annotations": {
            "compDepth": "57.1399994",
            "compReleaseDepth": "3",
            "compMeanRate": "94"
        }
    }, {
        "start": "1.42",
        "end": "2.1",
        "annotations": {
            "compDepth": "57.1399994",
            "compReleaseDepth": "2.5",
            "compMeanRate": "89"
        }
    }, {
        "start": "2.21",
        "end": "2.87",
        "annotations": {
            "compDepth": "58.5699997",
            "compReleaseDepth": "2.5",
            "compMeanRate": "79"
        }
    }, 

Solution

  • One simple way is to nest yourself the data in a new column using to_dict

    df['annotations'] = df[['compDepth','compReleaseDepth','compMeanRate']].to_dict(orient='records')
    

    Then you use to_json(orient='records') only on the 3 columns you want in your final output

    df[['start','end','annotations']].to_json(orient='records')