Search code examples
javascriptpythonjsonpandasparallel-coordinates

Converting pandas dataframe to custom JSON format (then to JS object)


I would like to plot the result of certain analysis using a parallel coordinates chart. Having found a brilliant example made using protovis http://mbostock.github.io/protovis/ex/cars.html and I am trying to re-arrange my data to replicate the structure of the data file as per the example (cars.js). Hence, my dataframe structure:

                    north   ch  wwr  ach  tmin  tmax  B1_EMS_DH26
Job_ID                                                           
EP_P1203_000000000    0.0  2.5   40  4.0    24    25       1272.2
EP_P1203_000000001    0.0  2.5   40  4.0    24    26       1401.9
EP_P1203_000000002    0.0  2.5   40  4.0    24    27       1642.3

should be converted into the following:

var results = [{
    name: "EP_P1203_000000000",
    north: 0.0,
    ch: 2.5,
    wwr: 40,
    ach: 4.0,
    tmin: 24,
    tmax: 25,
    origin: 1272.2
  },
  {
    name: "EP_P1203_000000001",
    north: 0.0,
    ch: 2.5,
    wwr: 40,
    ach: 4.0,
    tmin: 24,
    tmax: 26,
    origin: 1401.9
  },
  {
    name: "EP_P1203_000000002",
    north: 0.0,
    ch: 2.5,
    wwr: 40,
    ach: 4.0,
    tmin: 24,
    tmax: 27,
    origin: 1272.3
  },
  {
    name: "EP_P1203_000000003",
    north: 0.0,
    ch: 2.5,
    wwr: 40,
    ach: 4.0,
    tmin: 24,
    tmax: 28,
    origin: 1642.3
  },
];

Other than the replacement of my column B1_EMS_DH26 into origin (which seems to be used by the chart to set line colours), I would like not having to slice rows and replacing symbols manually.

Using the dataframe.to_json command returns one line:

{
  "EP_P1203_000000000": {
    "north": 0.0,
    "ch": 2.5,
    "wwr": 40,
    "ach": 4.0,
    "tmin": 24,
    "tmax": 25,
    "B1_EMS_DH26": 1272.2
  },
  "EP_P1203_000000001": {
    "north": 0.0,
    "ch": 2.5,
    "wwr": 40,
    "ach": 4.0,
    "tmin": 24,
    "tmax": 26,
    "B1_EMS_DH26": 1401.9
  },
  "EP_P1203_000000002": {
    "north": 0.0,
    "ch": 2.5,
    "wwr": 40,
    "ach": 4.0,
    "tmin": 24,
    "tmax": 27,
    "B1_EMS_DH26": 1642.3
  }
}

which is still not quite right. How do you suggest to go about this?


Solution

  • Your DataFrame (for recreation-purposes):

    df= pd.DataFrame(
        {'north': [0.0, 0.0, 0.0],
         'B1_EMS_DH26': [1272.2, 1401.9, 1642.3],
         'tmax': [25, 26, 27],
         'wwr': [40, 40, 40],
         'ch': [2.5, 2.5, 2.5],
         'tmin': [24, 24, 24],
         'ach': [4.0, 4.0, 4.0]
         },
        index=['EP_P1203_000000000', 'EP_P1203_000000001', 'EP_P1203_000000002'],
        columns=['north', 'ch', 'wwr', 'ach', 'tmin', 'tmax', 'B1_EMS_DH26'])
    

    This is probably the worst possible way to do this but it works (I think):

    import re
    import json
    
    with open('whatever.json', 'w') as f:
        f.write('var results = [\n')
        for k,v in df.drop('B1_EMS_DH26', axis=1).T.to_dict().items():
            f.write("{name:"+json.dumps(k)+", "+re.sub(r'[{"\']', '', json.dumps(v))+',\n')
        f.write('];')
    

    Produces:

    var results = [{
        name: "EP_P1203_000000001",
        ach: 4.0,
        north: 0.0,
        tmax: 26.0,
        tmin: 24.0,
        ch: 2.5,
        wwr: 40.0
      },
      {
        name: "EP_P1203_000000000",
        ach: 4.0,
        north: 0.0,
        tmax: 25.0,
        tmin: 24.0,
        ch: 2.5,
        wwr: 40.0
      },
      {
        name: "EP_P1203_000000002",
        ach: 4.0,
        north: 0.0,
        tmax: 27.0,
        tmin: 24.0,
        ch: 2.5,
        wwr: 40.0
      },
    ];
    

    It will output a file in the structure I think you're looking for. If not, let me know. Horrible hack I know. Someone with advanced json experience no-doubt knows a better way.