Assume I have a dataframe:
ID Timestamp Temp Height
01 2016-01-01 09:00:00 17 10
01 2016-01-01 10:15:00 20 5
02 2019-05-23 07:30:00 12 3
01 2012-01-01 09:00:00 15 12
01 2018-01-01 09:00:00 13 18
02 2013-11-23 17:40:00 22 9
What I want is to get the median value of column 'Height' based on 'ID', & month & hour from 'Timestamp'. An example of my desired output:
ID Month Hour Height_median
01 01 09 12
01 01 10 5
02 05 07 3
02 11 17 9
Here you go:
import pandas as pd
df = pd.DataFrame(dict(ID=[1,1,2,1,1,2],
Timestamp=['2016-01-01 09:00:00','2016-01-01 10:15:00','2019-05-23 07:30:00',
'2012-01-01 09:00:00','2018-01-01 09:00:00','2013-11-23 17:40:00'],
Height = [10,5,3,12,18,9]))
df.Timestamp = pd.to_datetime(df.Timestamp)
df['month'] = df.Timestamp.apply(lambda x: x.month)
df['hour'] = df.Timestamp.apply(lambda x: x.hour)
df.groupby(['ID','month','hour'])[['Height']].agg('median').reset_index()