Search code examples
python-3.xpandashierarchy

How to divide multilevel columns in Python


I have a df like this:

arrays = [['bar', 'bar', 'baz', 'baz'],
          ['one', 'two', 'one', 'two']]

tuples = list(zip(*arrays))

index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 4), index=['A', 'B', 'C'], columns=index)
df.head()

returning:

enter image description here

I want to add some columns where all second level dimensions are divided by each other - bar one is divided by baz one, and bar two is divided by baz two, etc.

df[["bar"]]/df[["baz"]]

and

df[["bar"]].div(df[["baz"]])

returns NaN's


Solution

  • You can select both levels by only one []:

    df1 = df["bar"]/df["baz"]
    print (df1)
    second        one        two
    A        1.564478  -0.115979
    B       14.604267 -19.749265
    C       -0.511788  -0.436637
    

    If want add MultiIndex add MultiIndex.from_product:

    df1.columns = pd.MultiIndex.from_product([['new'], df1.columns], names=df.columns.names)
    print (df1)
    first         new           
    second        one        two
    A        1.564478  -0.115979
    B       14.604267 -19.749265
    C       -0.511788  -0.436637
    

    Another idea for MultiIndex in output is use your solution with rename columns to same names, here new:

    df2 = df[["bar"]].rename(columns={'bar':'new'})/df[["baz"]].rename(columns={'baz':'new'})
    print (df2)
    first         new           
    second        one        two
    A        1.564478  -0.115979
    B       14.604267 -19.749265
    C       -0.511788  -0.436637