Tags: python, pandas, dataframe, multi-index

Setting to second level of Pandas multi-index results in NaN


I am trying to set values in a DataFrame for a specific subset of a MultiIndex, but instead of the values being set, I just get NaN values.

Here is an example:

import numpy as np
import pandas as pd

df_test = pd.DataFrame(np.ones((10, 2)), index=pd.MultiIndex.from_product([['even', 'odd'], [0, 1, 2, 3, 4]], names=['parity', 'mod5']))
df_test.loc[('even',), 1] = pd.DataFrame(np.arange(5) + 5, index=np.arange(5))
df_test
               0    1
parity mod5          
even   0     1.0  NaN
       1     1.0  NaN
       2     1.0  NaN
       3     1.0  NaN
       4     1.0  NaN
odd    0     1.0  1.0
       1     1.0  1.0
       2     1.0  1.0
       3     1.0  1.0
       4     1.0  1.0

whereas I expected the following output:

               0    1
parity mod5          
even   0     1.0  5.0
       1     1.0  6.0
       2     1.0  7.0
       3     1.0  8.0
       4     1.0  9.0
odd    0     1.0  1.0
       1     1.0  1.0
       2     1.0  1.0
       3     1.0  1.0
       4     1.0  1.0

What do I need to do differently to get the expected result? I have tried a few other things, such as df_test.loc['even']['1'], but that doesn't affect the DataFrame at all.


Solution
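
The NaN values come from label alignment: when the right-hand side of a .loc assignment is a DataFrame or Series, pandas matches its labels against the target instead of assigning by position. Here the right-hand side has a flat index 0..4 and a column named 0, which do not line up with the ('even', k) rows and column 1 being set, so the cells end up as NaN. A minimal sketch of a positional fix, assuming the new values are already in the same order as the 'even' rows, is to assign a plain NumPy array:

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame(
    np.ones((10, 2)),
    index=pd.MultiIndex.from_product(
        [['even', 'odd'], [0, 1, 2, 3, 4]], names=['parity', 'mod5']
    ),
)

# A bare ndarray carries no labels, so it is assigned by position.
df_test.loc[('even',), 1] = np.arange(5) + 5
print(df_test)
```

Calling .to_numpy() on an existing DataFrame or Series before assigning has the same effect, since it also drops the labels.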

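  • In this example, the new values happen to be in the same order as the rows of df_test. If you need index matching and the order of the indices is not guaranteed, you can use update, like this:

    index = np.arange(5)
    np.random.shuffle(index)  # simulate labels arriving in arbitrary order
    df_other = pd.DataFrame(np.arange(5) + 5, index=index).squeeze()
    df_test.loc[('even',), 1].update(df_other)

    The .squeeze() is needed to convert the single-column DataFrame into a Series (whose shape and indices match those of df_test.loc[('even',), 1]). Note that this calls Series.update on the selected column, so it only changes df_test if that selection is a view of the original data; with Copy-on-Write enabled (the default from pandas 3.0), the update is applied to a temporary copy and df_test is left unchanged.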
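
    Because the update above is called on the result of a .loc selection, whether df_test changes depends on that selection sharing memory with the frame. A variant that does not rely on this is sketched below, assuming every second-level label of the 'even' rows appears exactly once in the new data: reorder the shuffled Series by label with reindex, then assign the resulting array directly on df_test. The names new_values and target_order are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame(
    np.ones((10, 2)),
    index=pd.MultiIndex.from_product(
        [['even', 'odd'], [0, 1, 2, 3, 4]], names=['parity', 'mod5']
    ),
)

# New values keyed by mod5 label, deliberately in shuffled order.
index = np.arange(5)
np.random.shuffle(index)
new_values = pd.Series(np.arange(5) + 5, index=index)

# Second-level labels of the 'even' rows, in the order they appear in df_test.
target_order = df_test.loc['even'].index

# Reorder by label first, then assign by position on df_test itself.
df_test.loc[('even',), 1] = new_values.reindex(target_order).to_numpy()
print(df_test)
```

    Any label missing from new_values would come out of reindex as NaN, so it is worth checking for missing labels first if the incoming data may be incomplete.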