Given this pandas Dataframe
list_index = pd.Series(['A' for i in range(2)] + ['B' for i in range(4)] + ['C' for i in range(3)] + ['D' for i in range(6)], name='indexes')
list_type = pd.Series(['a', 'c'] + ['a', 'b','c','d'] + ['f','g','i'] + ['a','c','d','e','f','g'], name='types')
df = pd.DataFrame({
'value' : [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
}, index=[list_index, list_type])
indexes types value
A a 1
c 2
B a 3
b 4
c 5
d 6
C f 7
g 8
i 9
D a 10
c 11
d 12
e 13
f 14
g 15
I want to multiply each value by a factor (aka ratio) contained in another pandas.Dataframe
ratio_df = pd.DataFrame({
'ratio' : [0.1, 0.2, 0.4, 0.5]
}, index=['A', 'B', 'C', 'D'])
ratio
A 0.1
B 0.2
C 0.4
D 0.5
So that all values in df
with 'indexes' == 'A' are multiplied by 0.1, and values with 'indexes' == 'B' are multiplied by 0.2 and so on.
I'm sure there is some smart way to do that but right now I can't really think of it. I know I can 'expand' ratio_df to the same length of df (with reset_index() and then creating a new column for df including ratios) and than simply perform *
operation pairwise, but I'm not sure that's the fastest method.
I also looked at this answer but it's slightly different from my case.
If just needing the product of the two columns Series.mul can be aligned based on index level.
Just select the columns and mul
on index level:
df['value'].mul(ratio_df['ratio'], level='indexes')
or with index level number:
df['value'].mul(ratio_df['ratio'], level=0)
The result is an unnamed Series:
indexes types
A a 0.1
c 0.2
B a 0.6
b 0.8
c 1.0
d 1.2
C f 2.8
g 3.2
i 3.6
D a 5.0
c 5.5
d 6.0
e 6.5
f 7.0
g 7.5
dtype: float64
The resulting Series can be assigned back to df
as needed:
df['new'] = df['value'].mul(ratio_df['ratio'], level='indexes')
df
:
value new
indexes types
A a 1 0.1
c 2 0.2
B a 3 0.6
b 4 0.8
c 5 1.0
d 6 1.2
C f 7 2.8
g 8 3.2
i 9 3.6
D a 10 5.0
c 11 5.5
d 12 6.0
e 13 6.5
f 14 7.0
g 15 7.5