I have a long table with many rows and my goal is to sort this table out to conduct further analysis on the different parameters.
The df looks like this:
datetime value type description name
0 2018-01-01 00:30:00+01:00 0.22 HLN HigLowNot NO2
1 2018-01-01 00:30:00+01:00 0.31 HLN HighLowNot CO
2 2018-01-01 00:30:00+01:00 1.15 HLN HighLowNot NO
3 2018-01-01 00:30:00+01:00 1.80 AIS AllinSep NO2
4 2018-01-01 00:30:00+01:00 2.60 AIS AllinSep CO
5 2018-01-01 00:30:00+01:00 2.30 AIS AllinSep NO
This is a short form, there are 20 unique names and 2 types.
This is what I do currently:
I sort by typ HLN h = df[df['type'] == 'HLN']
and get that specific type in one lon table.
After that I create for each name a subset and after that a pivot table. I want to automate both parts because they have 40 lines in total. Ist that possible?
h_NO2 = h[h['name'] == 'NO2']
h_NO = h[h['name'] == 'NO']
h_CO = h[h['name'] == 'CO']
h_NO2_subset = h_NO2.pivot(index ="datetime", columns="description", values = "value")
h_NO_subset = h_NO.pivot(index ="datetime", columns="description", values = "value")
h_CO_subset = h_CO.pivot(index ="datetime", columns="description", values = "value")
Use a dictionary as container:
dict_h = {i: h[h['name'] == i]
for i in h['name'].unique()
}
dict_h_subset = {k: v.pivot(index="datetime",
columns="description",
values="value")
for k,v in dict_h.items()
}
Then access the results by key:
dict_h_subset['NO2']
output:
>>> dict_h
{'NO2': datetime value type description name
0 2018-01-01 00:30:00+01:00 0.22 HLN HigLowNot NO2
3 2018-01-01 00:30:00+01:00 1.80 AIS AllinSep NO2,
'CO': datetime value type description name
1 2018-01-01 00:30:00+01:00 0.31 HLN HighLowNot CO
4 2018-01-01 00:30:00+01:00 2.60 AIS AllinSep CO,
'NO': datetime value type description name
2 2018-01-01 00:30:00+01:00 1.15 HLN HighLowNot NO
5 2018-01-01 00:30:00+01:00 2.30 AIS AllinSep NO}
>>> dict_h_subset
{'NO2': description AllinSep HigLowNot
datetime
2018-01-01 00:30:00+01:00 1.8 0.22,
'CO': description AllinSep HighLowNot
datetime
2018-01-01 00:30:00+01:00 2.6 0.31,
'NO': description AllinSep HighLowNot
datetime
2018-01-01 00:30:00+01:00 2.3 1.15}