I have a dataframe with the price fluctuations of the Nasdaq stock index every minute. In trading it is important to take into account data on different time units (to know the short term, medium and long term trends...)
So I used the resample() method of Pandas to get a dataframe with the price in 5 minutes in addition to the original 1 minute:
df1m = pd.DataFrame({
'Time' : ['2022-01-11 09:30:00', '2022-01-11 09:31:00', '2022-01-11 09:32:00', '2022-01-11 09:33:00', '2022-01-11 09:34:00', '2022-01-11 09:35:00', '2022-01-11 09:36:00' , '2022-01-11 09:37:00' , '2022-01-11 09:38:00' ,
'2022-01-11 09:39:00', '2022-01-11 09:40:00'],
'Price' : [1,2,3,4,5,6,7,8,9,10,11]})
df1m['Time'] = pd.to_datetime(df1m['Time'])
df1m.set_index(['Time'], inplace =True)
df5m = df1m.resample('5min').first()
I renamed the column names to 5min :
df5m.rename(columns={'Price' : 'Price5'})
Unfortunately the change of column names is no longer taken into account when the two dataframes (1 and 5 min) are put together:
df_1m_5m = pd.concat([df1m, df5m], axis=1)
How to rename definitively the columns created for the 5min data and avoid having twice the same column name for different data?
You can use:
df5m = df1m.resample('5min').first().add_suffix('5')
df_1m_5m = pd.concat([df1m, df5m], axis=1)
Output:
>>> df_1m_5m
Price Price5
Time
2022-01-11 09:30:00 1 1.0
2022-01-11 09:31:00 2 NaN
2022-01-11 09:32:00 3 NaN
2022-01-11 09:33:00 4 NaN
2022-01-11 09:34:00 5 NaN
2022-01-11 09:35:00 6 6.0
2022-01-11 09:36:00 7 NaN
2022-01-11 09:37:00 8 NaN
2022-01-11 09:38:00 9 NaN
2022-01-11 09:39:00 10 NaN
2022-01-11 09:40:00 11 11.0
You forgot to reassign the result to your dataframe:
df5m = df5m.rename(columns={'Price' : 'Price5'})
# OR
df5m.rename(columns={'Price' : 'Price5'}, inplace=True)
Output:
>>> df5m
Price5
Time
2022-01-11 09:30:00 1
2022-01-11 09:35:00 6
2022-01-11 09:40:00 11