
Sum of NaNs to equal NaN (not zero)


I can add a TOTAL column to my DataFrame using df['TOTAL'] = df.sum(axis=1), which sums the elements of each row like this:

   col1  col2  TOTAL
0   1.0   5.0    6.0
1   2.0   6.0    8.0
2   0.0   NaN    0.0
3   NaN   NaN    0.0

However, I would like the total of the bottom row to be NaN, not zero, like this:

   col1  col2  TOTAL
0   1.0   5.0    6.0
1   2.0   6.0    8.0
2   0.0   NaN    0.0
3   NaN   NaN    NaN

Is there a performant way to achieve this?


Solution

  • Add the parameter min_count=1 to DataFrame.sum:

    min_count : int, default 0
    The required number of valid values to perform the operation. If fewer than min_count non-NA values are present the result will be NA.

    New in version 0.22.0: Added with the default being 0. This means the sum of an all-NA or empty Series is 0, and the product of an all-NA or empty Series is 1.

    df['TOTAL'] = df.sum(axis=1, min_count=1)
    print(df)
       col1  col2  TOTAL
    0   1.0   5.0    6.0
    1   2.0   6.0    8.0
    2   0.0   NaN    0.0
    3   NaN   NaN    NaN
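
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The question does not show how the DataFrame was built, so the construction below is an assumption reconstructed from the printed table. It also restricts the sum to the two data columns (the `cols` list is an illustrative helper, not part of the original answer), so that re-running the snippet does not fold an existing TOTAL column into the sum.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data (not shown in the original post).
df = pd.DataFrame({
    "col1": [1.0, 2.0, 0.0, np.nan],
    "col2": [5.0, 6.0, np.nan, np.nan],
})

# Illustrative helper: sum only the data columns, never an existing TOTAL.
cols = ["col1", "col2"]

# Default behaviour (min_count=0): NaNs are skipped, so an all-NaN row sums to 0.0.
default_total = df[cols].sum(axis=1)

# min_count=1: at least one non-NA value is required, otherwise the result is NaN.
strict_total = df[cols].sum(axis=1, min_count=1)

df["TOTAL"] = strict_total
print(df)
```

Row 2 (0.0 and NaN) still sums to 0.0 because it contains one valid value; only the all-NaN row becomes NaN. Since this stays a single vectorized call, it should cost about the same as the plain sum.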