python, pandas, multi-index

How to multiply each column by all the columns of another df, obtaining a multi-index?


I need to split each column according to percentages found in another df. E.g.:

>>> import pandas as pd
>>> 
>>> things = ['some thing', 'another thing']
>>> 
>>> amount = pd.DataFrame({2019: [10, 20], 2020: [100, 200]}, index=things)
>>> amount
               2019  2020
some thing       10   100
another thing    20   200
>>> 
>>> split = pd.DataFrame({'first': [0.2, 0.9], 'second': [0.8, 0.1]}, index=things)
>>> split
               first  second
some thing       0.2     0.8
another thing    0.9     0.1
>>> 
>>> result = amount ??? split  # how to do this?
>>> result
               2019         2020       
              first second first second
some thing        2      8    20     80
another thing    18      2   180     20

How can I do this in pandas in one simple shot?


Solution

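  • You can use pd.concat() with two list comprehensions: one builds the cross product of the columns of the two DataFrames (2 x 2 = 4 Series here), and the other builds the matching column keys.

    [amount[i] * split[j] for i in amount.columns for j in split.columns] builds the products, one Series per (amount column, split column) pair

    [(x, y) for x in amount.columns for y in split.columns] builds the tuples that become the two-level column index

    Both comprehensions iterate in the same order, so each Series lines up with its key: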

    result = pd.concat([amount[i] * split[j] for i in amount.columns for j in split.columns],
                       keys=[(x, y) for x in amount.columns for y in split.columns], axis=1)

    print(result)

                   2019          2020       
                  first second  first second
    some thing      2.0    8.0   20.0   80.0
    another thing  18.0    2.0  180.0   20.0
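
As a possible variant, pd.concat() also accepts a dict, in which case the dict keys serve as the outer column level. The sketch below assumes the same amount and split frames as in the question and uses DataFrame.mul(..., axis=0) to scale every split column by one year's amounts. It is an alternative formulation, not the approach shown above.

```python
import pandas as pd

things = ['some thing', 'another thing']
amount = pd.DataFrame({2019: [10, 20], 2020: [100, 200]}, index=things)
split = pd.DataFrame({'first': [0.2, 0.9], 'second': [0.8, 0.1]}, index=things)

# For each year, multiply every column of `split` by that year's amounts,
# aligning on the row index (axis=0). When concatenating along axis=1,
# the dict keys become the outer level of the column MultiIndex.
result = pd.concat(
    {year: split.mul(amount[year], axis=0) for year in amount.columns},
    axis=1,
)
print(result)
```

This should produce the same float-valued frame, with the years on the outer column level and first/second on the inner level.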
    

    If you are sure the resulting values will be whole numbers and you want them stored as integers, you can convert the result with .astype(int):

    result = pd.concat([amount[i] * split[j] for i in amount.columns for j in split.columns],
                       keys=[(x, y) for x in amount.columns for y in split.columns], axis=1).astype(int)

    print(result)

                   2019         2020       
                  first second first second
    some thing        2      8    20     80
    another thing    18      2   180     20
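
One caveat worth illustrating: .astype(int) truncates toward zero, so a product that lands slightly below a whole number because of floating-point representation loses a unit. The sketch below uses hypothetical data (an amount of 100 and a split of 0.29 / 0.71, not taken from the question) and compares a plain cast with rounding first via .round().

```python
import pandas as pd

# Hypothetical data, chosen because 100 * 0.29 is not exactly 29 in floating point.
amount = pd.DataFrame({2019: [100]}, index=['some thing'])
split = pd.DataFrame({'first': [0.29], 'second': [0.71]}, index=['some thing'])

result = pd.concat([amount[i] * split[j] for i in amount.columns for j in split.columns],
                   keys=[(x, y) for x in amount.columns for y in split.columns], axis=1)

truncated = result.astype(int)        # drops the fractional part
rounded = result.round().astype(int)  # rounds to the nearest integer, then casts

print(truncated)
print(rounded)
```

With these inputs the plain cast should yield 28 for (2019, 'first'), while rounding first yields 29, so .round().astype(int) is the safer choice whenever the products are only approximately whole.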