Tags: python, json, pandas, string, file-format

Pandas to JSON file formatting issue, adding \ to strings


I am using pandas.DataFrame.to_json to convert a DataFrame to JSON.

data = df.to_json(orient="records")
print(data)

This works fine, and the printed output in the console is as expected:

[{"n":"f89be390-5706-4ef5-a110-23f1657f4aec:voltage","bt":1610040655,"u":"V","v":237.3},
{"n":"f89be390-5706-4ef5-a110-23f1657f4aec:power","bt":1610040836,"u":"W","v":512.3},
{"n":"f89be390-5706-4ef5-a110-23f1657f4aec:voltage","bt":1610040840,"u":"V","v":238.4}]

The problem appears when I upload the data to an external API, which converts it to a file, or when I write it to a file locally. In both cases the output has a \ added before every double quote.

import json

def dataToFile(processedData):
    with open('data.json', 'w') as outfile:
        json.dump(processedData, outfile)

The contents of the resulting file are shown below:

[{\"n\":\"f1097ac5-0ee4-48a4-8af5-bf2b58f3268c:power\",\"bt\":1610024746,\"u\":\"W\",\"v\":40.3},
{\"n\":\"f1097ac5-0ee4-48a4-8af5-bf2b58f3268c:voltage\",\"bt\":1610024751,\"u\":\"V\",\"v\":238.5},
{\"n\":\"f1097ac5-0ee4-48a4-8af5-bf2b58f3268c:power\",\"bt\":1610024764,\"u\":\"W\",\"v\":39.7}]

Is there any specific formatting I should include or exclude when writing the data to a file?


Solution

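  • Your data variable is already a string of JSON text, not a Python list of dictionaries. Passing it to json.dump() encodes that string a second time, which is what escapes every double quote with a backslash. You can do a few things: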

    1. Use DataFrame.to_json() to write the file directly; its first argument is the file path:
    df.to_json('./data.json', orient='records')

    2. Write the JSON string to the file as plain text:
    def write_text(text: str, path: str):
        with open(path, 'w') as file:
            file.write(text)

    data = df.to_json(orient="records")

    write_text(data, './data.json')

    3. If you want to work with the data as Python objects first, convert the DataFrame with to_dict() and serialize it with json.dump():
    import json

    def write_json(data, path, indent=4):
        with open(path, 'w') as file:
            json.dump(data, file, indent=indent)

    df_data = df.to_dict(orient='records')

    # ...some operations here...

    write_json(df_data, './data.json')
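
To see why the backslashes appear, here is a minimal sketch that reproduces the double encoding without pandas. The short data string is a made-up stand-in for what df.to_json(orient="records") returns, and the variable names are only illustrative.

```python
import json

# Stand-in for the string that df.to_json(orient="records") returns.
data = '[{"n":"voltage","v":237.3}]'

# Serializing a string that already holds JSON encodes it a second time:
# the whole text becomes one JSON string and its inner quotes are escaped.
double_encoded = json.dumps(data)
print(double_encoded)  # "[{\"n\":\"voltage\",\"v\":237.3}]"

# Parsing the string first recovers the list of dicts, which serializes cleanly.
records = json.loads(data)
clean = json.dumps(records, separators=(",", ":"))
print(clean)  # [{"n":"voltage","v":237.3}]
```

The same reasoning probably applies to the API upload: if the client library serializes the payload for you (for example, the json= parameter of requests.post), pass it the parsed records and not the already-encoded string. Otherwise, send the string as the raw request body.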