I have a csv file that with rows that looks like this:
745198;2024-09-10 10:09:10.7;leaf-2;{"Accelerometer": {"X": 0.055297852, "Y": 0.993530273, "Z": 0.000244141}}
745199;2024-09-10 10:09:10.71;leaf-2;{"Accelerometer": {"X": 0.056274414, "Y": 0.994384766, "Z": 0.000976563}}
745200;2024-09-10 10:09:10.721;leaf-2;{"Accelerometer": {"X": 0.055786133, "Y": 0.994018555, "Z": 0.000854492}}
745201;2024-09-10 10:09:10.732;leaf-2;{"Accelerometer": {"X": 0.055053711, "Y": 0.993530273, "Z": 0.000854492}}
and I would like to read this data into a dataframe, and somehow get the 3 accelerometer data into separate columns, but I haven't been able to figure out a good way of doing this. I have searched for similar cases Split / Explode a column of dictionaries into separate columns with pandas But none of the suggested solutions seems to work.
I can create a for loop and manually extract the information I need line by line, but I'm guessing there should be a nice fast way of doing this that I just do not know of.
First convert the JSON string in the 'Accelerometer' column to a dictionary:
df = pd.read_csv('file.csv', sep=';', header=None, names=['ID', 'Timestamp', 'Device', 'Accelerometer'])
df['Accelerometer'] = df['Accelerometer'].apply(json.loads)
Then normalize the 'Accelerometer' column to separate columns:
accelerometer_df = pd.json_normalize(df['Accelerometer'])
accelerometer_df.columns = ['X', 'Y', 'Z']
Finally concatenate the original DataFrame with the new accelerometer columns:
result_df = pd.concat([df.drop(columns=['Accelerometer']), accelerometer_df], axis=1)
print(result_df)
The output:
ID Timestamp Device X Y Z
0 745198 2024-09-10 10:09:10.7 leaf-2 0.055298 0.993530 0.000244
1 745199 2024-09-10 10:09:10.71 leaf-2 0.056274 0.994385 0.000977
2 745200 2024-09-10 10:09:10.721 leaf-2 0.055786 0.994019 0.000854
3 745201 2024-09-10 10:09:10.732 leaf-2 0.055054 0.993530 0.000854