Search code examples
pythonpandasto-json

Pandas: Write dataframe to json with split


I need write data to json. Problem is that I can't establish delimiter to string. My df looks like

     id       date  val
0   123 2015-12-12    1
1   123 2015-12-13    1
2   123 2015-12-14    0
3   123 2015-12-15    1
4   123 2015-12-16    1
5   123 2015-12-17    0
6   123 2015-12-18    1
7   456 2015-12-12    1
8   456 2015-12-13    1
9   456 2015-12-14    0
10  456 2015-12-15    1

I use

df.groupby('id').apply(lambda x: x.set_index('date')['val'].to_dict()).to_json('nielsen', orient='index')

and I want to get smth like

{
"1234567890abcdef1234567890abcdef": {
    "2016-06": 1, 
    "2016-05": 0, 
    "2016-04": 0, 
    "2016-03": 1, 
    "2016-02": 1, 
    "2016-01": 0
}, 
"0987654321abcdef1234567890abcdef": {
    "2016-06": 1, 
    "2016-05": 1, 
    "2016-04": 1, 
    "2016-03": 0, 
    "2016-02": 0, 
    "2016-01": 0
}
}

How can I do that?


Solution

  • You could have to_json write to a StringIO object and then use json loads/dumps to format to your liking:

    import pandas as pd
    import StringIO, json
    df = pd.read_csv('data.csv')
    nielson = StringIO.StringIO()
    df.groupby('id').apply(lambda x: x.set_index('date')['val'].to_dict()).to_json(nielson, orient='index')
    print(json.dumps(json.loads(nielson.getvalue()),indent=2))
    

    This produces:

    {
      "123": {
        "2015-12-14": 0, 
        "2015-12-15": 1, 
        "2015-12-16": 1, 
        "2015-12-17": 0, 
        "2015-12-12": 1, 
        "2015-12-13": 1, 
        "2015-12-18": 1
      }, 
      "456": {
        "2015-12-14": 0, 
        "2015-12-15": 1, 
        "2015-12-12": 1, 
        "2015-12-13": 1
      }
    }
    

    See help(json.dumps) for other formatting options. See help(json.dump) for details on how to write to a file (basic example shown below):

    with open('nielsen','w') as f:
        json.dump(json.loads(nielson.getvalue()), f, indent=2)