pandas, dataframe, nan, drop

Drop NaN from a Total row that sums the numeric columns


I have looked high and low and tried many different snippets from this site to solve my problem. Maybe someone can make a suggestion?

I have a dataframe that looks like this: [image of my dataframe]

I hope that table came out right. I'm new to Stack Overflow, so sorry if it didn't. I have struggled with this for hours. I finally managed to show my Total row at the bottom, but I don't want NaN to show in the one column that holds strings. Can someone tell me what on EARTH it takes to simply remove the NaN from ONE CELL in this dataframe? I'm at my wits' end.


Solution

  • One possible solution (including creation of the dataframe):

    import pandas as pd
    import numpy as np
    
    # build the base dataframe
    df = pd.DataFrame({'gender': ['male', 'female', 'others'], 'total': [484, 81, 11]})
    # each row's share of the grand total, rounded to 2 decimals
    df['percentage'] = (df['total'] / df['total'].sum()).round(2)
    # append a TOTAL row holding the sums of the numeric columns only;
    # 'gender' is NaN in this row and 'total' is upcast to float
    df.loc['TOTAL'] = df.select_dtypes(np.number).sum()
    # overwrite only the NaN in the TOTAL row's 'gender' cell with an empty string
    df.loc['TOTAL', 'gender'] = ''
    

    Result:

           gender  total  percentage
    0        male  484.0        0.84
    1      female   81.0        0.14
    2      others   11.0        0.02
    TOTAL          576.0        1.00
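
If the float totals (484.0 instead of 484) are unwanted, one possible variation is to build the TOTAL row as its own one-row dataframe with the string cell already set to an empty string, then append it with `pd.concat`. This is only a sketch under the same sample data as above; the names `total_row` and `result` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# same sample data as in the answer above
df = pd.DataFrame({'gender': ['male', 'female', 'others'], 'total': [484, 81, 11]})
df['percentage'] = (df['total'] / df['total'].sum()).round(2)

# Build the totals as a one-row frame with the string cell already filled in,
# so no NaN is ever created and 'total' is not upcast to float.
# `total_row` and `result` are illustrative names (assumptions).
total_row = pd.DataFrame(
    {
        'gender': [''],
        'total': [df['total'].sum()],
        'percentage': [df['percentage'].sum()],
    },
    index=['TOTAL'],
)

# stack the original rows and the TOTAL row
result = pd.concat([df, total_row])
print(result)
```

If the NaN is already in the dataframe, filling just that one column should also work: `df['gender'] = df['gender'].fillna('')`.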