python json pandas dataframe data-conversion

Backward slash when converting dataframe to json file


I am trying to convert a pandas DataFrame to JSON. One column holds path-like strings (for example, /disk/folder/folder/file.txt), and in the output a backslash (\) appears before every forward slash (/). How can I avoid the backslashes?

My dataframe:

    path                            time    content

0   /disk/folder/folder/file.txt    3.0     नमस्ते
1   /disk/folder/folder/file1.txt   4.0     नमस्ते

My code to convert the pandas DataFrame to JSON:

with open('temp.json', 'w', encoding='utf-8') as file:
    df.to_json(file, orient='records', force_ascii=False, lines=True)

Output JSON file

{"path":"\/disk\/folder\/folder\/file.txt","time":3.0,"content":"नमस्ते"}
{"path":"\/disk\/folder\/folder\/file1.txt","time":4.0,"content":"नमस्ते"}

Expected output

{"path":"/disk/folder/folder/file.txt","time":3.0,"content":"नमस्ते"}
{"path":"/disk/folder/folder/file1.txt","time":4.0,"content":"नमस्ते"}

Please help me solve this problem.


Solution

  • That's alright. The backslash is an escape character: JSON allows a forward slash inside a string to be written as \/, and pandas' to_json writes it that way, as your output shows. The escaped form is still valid JSON, so the paths come back unchanged when you load the file again.
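
    As a quick check of this, the sketch below decodes one of the escaped lines from the question with the standard json module. The variable names are only illustrative.

```python
import json

# One line of the pandas output from the question, with escaped slashes.
escaped_line = r'{"path":"\/disk\/folder\/folder\/file.txt","time":3.0,"content":"नमस्ते"}'

# json.loads treats \/ as an escape for /, so the plain path is restored.
record = json.loads(escaped_line)
print(record["path"])
```

    Any standards-compliant JSON parser should decode the escape the same way, so the backslashes only matter if the file is read as raw text.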


    EDIT 1

    Interestingly, this does not happen with the standard json module, without pandas.

    from json import dumps
    print(dumps({"path": "/disk/folder/folder/file.txt", "time": 3.0, "content": "नमस्ते"}, ensure_ascii=False))
    

    Output:

    {"path": "/disk/folder/folder/file.txt", "time": 3.0, "content": "नमस्ते"}
    

    EDIT 2

    Looks like someone beat me to the answer: Forward slash in json file from pandas dataframe
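
    If the escaped slashes must not appear in the file at all, one possible workaround (an assumption here, not necessarily what the linked answer does) is to let pandas return the JSON as a string and replace \/ with / before writing it. The helper name unescape_slashes is hypothetical, and the sample string stands in for what df.to_json(orient='records', force_ascii=False, lines=True) returns when no file is passed.

```python
def unescape_slashes(json_text: str) -> str:
    """Replace the JSON escape \\/ with a plain / (hypothetical helper)."""
    return json_text.replace("\\/", "/")

# Stand-in for the string returned by df.to_json(...) in the question.
pandas_output = '{"path":"\\/disk\\/folder\\/folder\\/file.txt","time":3.0,"content":"नमस्ते"}\n'

cleaned = unescape_slashes(pandas_output)
print(cleaned)
```

    The cleaned string can then be written with open('temp.json', 'w', encoding='utf-8'), as in the question.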


    EDIT 3 - Full solution by OP for easier understanding

    dataframe

        path                            time    content
    
    0   /disk/folder/folder/file.txt    3.0     नमस्ते
    1   /disk/folder/folder/file1.txt   4.0     नमस्ते
    

    dataframe to list of record dictionaries

    dict_records = dataframe.to_dict('records')
    
    dict_records
    [{'path': '/disk/folder/folder/file.txt', 'time': 3.0, 'content': 'नमस्ते'},
     {'path': '/disk/folder/folder/file1.txt', 'time': 4.0, 'content': 'नमस्ते'}]
    

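    list of records to JSON lines, dumped with the ndjson library

    import ndjson
    # encoding='utf-8' so the non-ASCII text is written correctly on any platform
    with open('sample.json', 'w', encoding='utf-8') as f:
        ndjson.dump(dict_records, f, ensure_ascii=False)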
    

    sample.json

    {"path": "/disk/folder/folder/file.txt", "time": 3.0, "content": "नमस्ते"}
    {"path": "/disk/folder/folder/file1.txt", "time": 4.0, "content": "नमस्ते"}
    

    If you want to install ndjson, refer to the ndjson Package page.
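
    If installing an extra package is not an option, the same file can probably be produced with the standard json module alone. This is a minimal sketch, assuming dict_records is the list of dictionaries shown above; the helper name write_json_lines is hypothetical.

```python
import io
import json

def write_json_lines(records, fp):
    """Write one JSON object per line (hypothetical helper)."""
    for record in records:
        # json.dumps does not escape forward slashes;
        # ensure_ascii=False keeps the Devanagari text readable.
        fp.write(json.dumps(record, ensure_ascii=False) + "\n")

dict_records = [
    {"path": "/disk/folder/folder/file.txt", "time": 3.0, "content": "नमस्ते"},
    {"path": "/disk/folder/folder/file1.txt", "time": 4.0, "content": "नमस्ते"},
]

# An in-memory buffer stands in for open('sample.json', 'w', encoding='utf-8').
buffer = io.StringIO()
write_json_lines(dict_records, buffer)
print(buffer.getvalue())
```

    Swapping the buffer for a real file opened with encoding='utf-8' should give the same lines on disk.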