I have created a dataframe with a MultiIndex like below:
import numpy as np
import pandas as pd
column_index= [np.array(['OPEN','OPEN','CLOSE','CLOSE']),np.array(['IBM','AAPL','IBM','AAPL'])]
df = pd.DataFrame(np.transpose(np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])),index=['20190101','20190102','20190103'],columns=column_index)
The result is like this:
OPEN CLOSE
IBM AAPL IBM AAPL
20190101 1 4 7 10
20190102 2 5 8 11
20190103 3 6 9 12
Now I'd like to create a new set of columns by doing something like:
df['RTN'] = df.CLOSE / df.OPEN
To get:
OPEN CLOSE RTN
IBM AAPL IBM AAPL IBM AAPL
20190101 1 4 7 10 7.0 2.5
20190102 2 5 8 11 4.0 2.2
20190103 3 6 9 12 3.0 2.0
That does not work. The nicest way I've been able to do this is like so:
rtn = df.CLOSE / df.OPEN
rtn = pd.concat([rtn],keys=['RTN'],axis=1)
df = pd.concat([df,rtn],axis=1)
Is there a way to do this as an assignment without the other steps?
One way is to rename
the columns prior to the operations. Then it's a simple concat:
u = df.loc[:, ['CLOSE']].rename(columns={'CLOSE': 'RTN'}, level=0).divide(
df.loc[:, ['OPEN']].rename(columns={'OPEN': 'RTN'}, level=0))
# [] DataFrame selection keeps MultiIndex
pd.concat([df, u], axis=1)
Alternatively, you can stack
+ eval
+ unstack
. It's concise, but perhaps not super performant for large datasets.
df.stack().eval('RTN = CLOSE/OPEN').unstack()
#df.stack().assign(RTN = lambda x: x.CLOSE/x.OPEN).unstack()
Without eval
:
df.stack().assign(RTN = lambda x: x.CLOSE/x.OPEN).unstack()
#or
df = df.stack()
df['RTN'] = df.CLOSE/df.OPEN
df = df.unstack()
Output in all cases:
OPEN CLOSE RTN
IBM AAPL IBM AAPL IBM AAPL
20190101 1 4 7 10 7.0 2.5
20190102 2 5 8 11 4.0 2.2
20190103 3 6 9 12 3.0 2.0