Search code examples
pythonpandasdataframemulti-index

Multiply each value in DataFrame cell according to multi-index name


Given this pandas Dataframe

list_index = pd.Series(['A' for i in range(2)] + ['B' for i in range(4)] + ['C' for i in range(3)] + ['D' for i in range(6)], name='indexes')
list_type = pd.Series(['a', 'c'] + ['a', 'b','c','d'] + ['f','g','i'] + ['a','c','d','e','f','g'], name='types')
df = pd.DataFrame({
    'value' : [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
}, index=[list_index, list_type])

indexes types   value
A       a       1
        c       2
B       a       3
        b       4
        c       5
        d       6
C       f       7
        g       8
        i       9
D       a       10
        c       11
        d       12
        e       13
        f       14
        g       15

I want to multiply each value by a factor (aka ratio) contained in another pandas.Dataframe

ratio_df = pd.DataFrame({
    'ratio' : [0.1, 0.2, 0.4, 0.5]
}, index=['A', 'B', 'C', 'D'])

    ratio
A   0.1
B   0.2
C   0.4
D   0.5

So that all values in df with 'indexes' == 'A' are multiplied by 0.1, and values with 'indexes' == 'B' are multiplied by 0.2 and so on.
I'm sure there is some smart way to do that but right now I can't really think of it. I know I can 'expand' ratio_df to the same length of df (with reset_index() and then creating a new column for df including ratios) and than simply perform * operation pairwise, but I'm not sure that's the fastest method.

I also looked at this answer but it's slightly different from my case.


Solution

  • If just needing the product of the two columns Series.mul can be aligned based on index level.

    Just select the columns and mul on index level:

    df['value'].mul(ratio_df['ratio'], level='indexes')
    

    or with index level number:

    df['value'].mul(ratio_df['ratio'], level=0)
    

    The result is an unnamed Series:

    indexes  types
    A        a        0.1
             c        0.2
    B        a        0.6
             b        0.8
             c        1.0
             d        1.2
    C        f        2.8
             g        3.2
             i        3.6
    D        a        5.0
             c        5.5
             d        6.0
             e        6.5
             f        7.0
             g        7.5
    dtype: float64
    

    The resulting Series can be assigned back to df as needed:

    df['new'] = df['value'].mul(ratio_df['ratio'], level='indexes')
    

    df:

                   value  new
    indexes types            
    A       a          1  0.1
            c          2  0.2
    B       a          3  0.6
            b          4  0.8
            c          5  1.0
            d          6  1.2
    C       f          7  2.8
            g          8  3.2
            i          9  3.6
    D       a         10  5.0
            c         11  5.5
            d         12  6.0
            e         13  6.5
            f         14  7.0
            g         15  7.5