Search code examples
pythonpandasdataframepivot-tableaggregate

Pandas pivot_table drops values for aggregate functions that return all None, even if dropna argument is False


The dropna=False argument of pandas pivot_table is not doing what I expect it to do. I'm starting with the following DataFrame:

import pandas as pd
# pd.__version__ = '0.23.3'

df = pd.DataFrame({'fruit':['apple','orange','peach','apple','peach'], 
    'size': [1,1.5,1,2,1],
    'taste': [7,8,6,6,10]})

...and the following aggregate functions:

def return_one(x): 
    return 1

def return_sum(x):
    return sum(x)

def return_none(x):
    return None 

I would like to reshape df so that the:

  • index is made up of the original column names ('size','taste').
  • values are the result of the aggregate functions applied to values within a group.
  • columns are the fruit types ('apple', 'orange', 'peach')

The following pivot_table is almost there:

pd.pivot_table(df, columns='fruit', 
               aggfunc=[return_sum,return_none,return_one], 
               dropna=False)

values returned by return_none have been dropped

Except that the output of return_none has been dropped. I would expect the pivoted table to include a set of empty cells for the return_none function because dropna=False, but it does not. Can anyone suggest how to include the output of return_none?


Solution

  • This issue was raised on the pandas repository and the fix was implemented in April 2019. In current versions of Pandas, the pivot_table Dropna argument is respected when it is False:

    import pandas as pd
    # pd.__version__ is '1.5.0'
    
    df = pd.DataFrame({'fruit': ['apple','orange','peach','apple','peach'], 
        'size': [1, 1.5, 1, 2, 1],
        'taste': [7, 8, 6, 6, 10]})
    
    def return_one(x): 
        return 1
    
    def return_sum(x):
        return sum(x)
    
    def return_none(x):
        return None 
    
    pd.pivot_table(df, columns='fruit', 
                   aggfunc=[return_sum, return_none, return_one], 
                   dropna=False)
    

    Returns:

          return_sum              return_none              return_one             
    fruit      apple orange peach       apple orange peach      apple orange peach
    size         3.0    1.5   2.0        None   None  None          1      1     1
    taste       13.0    8.0  16.0        None   None  None          1      1     1