Search code examples
pythonpandasmulti-index

Create new column for in multiindex dataframe and fillna


Let's say I have the following multi-indexed dataframe, generated with the following code:

import pandas as pd, numpy as np

names = ['Name1','Name2','Name3','Name4']
values = ['x1','x2','x3','x4']
categories = ['y1','y2','y3']

x1 = pd.Series([0, 0, 0], index=categories)

index = pd.MultiIndex.from_product([names, values]); placeholders = np.zeros((len(names)*len(values), len(categories)))

df = pd.DataFrame(placeholders, index=index, columns=categories)

for i in names:
    for j in values:
        df.loc[i,j] = x1

           y1   y2   y3
Name1 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name2 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name3 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name4 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0

How would I use a dictionary such as the following to fill the column y3 corresponding to row x1 and filling np.nan for values for any other rows in the y3 column or when a given Name (i.e. Name1, Name2, Name3, etc.) is not a key in the dictionary?

{'Name1': 54, 'Name3': 50}

Expected output (0's could be np.nan):

           y1   y2   y3
Name1 x1  0.0  0.0   54
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name2 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name3 x1  0.0  0.0   50
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0
Name4 x1  0.0  0.0  0.0
      x2  0.0  0.0  0.0
      x3  0.0  0.0  0.0
      x4  0.0  0.0  0.0

Solution

  • The idiomatic way is probably to use update

    In [31]: df2 = pd.DataFrame({(k,'x1'): {'y3': v} for k, v in d.items()}).T
    
    In [32]: df2
    Out[32]: 
              y3
    Name1 x1  54
    Name3 x1  50
    
    In [33]: df.update(df2)
    
    In [34]: df
    Out[34]: 
               y1   y2    y3
    Name1 x1  0.0  0.0  54.0
          x2  0.0  0.0   0.0
          x3  0.0  0.0   0.0
          x4  0.0  0.0   0.0
    Name2 x1  0.0  0.0   0.0
          x2  0.0  0.0   0.0
          x3  0.0  0.0   0.0
          x4  0.0  0.0   0.0
    Name3 x1  0.0  0.0  50.0
          x2  0.0  0.0   0.0
          x3  0.0  0.0   0.0
          x4  0.0  0.0   0.0
    Name4 x1  0.0  0.0   0.0
          x2  0.0  0.0   0.0
          x3  0.0  0.0   0.0
          x4  0.0  0.0   0.0