Tags: python, pandas, dataframe, data-analysis

Removing case-insensitive duplicates and summing their values in a pandas DataFrame


I have a DataFrame df:

Name    Count
Ram     1
ram     2
raM     1
Arjun   3
arjun   4

My desired output is:

Name    Count
Ram     4
Arjun   7

I tried groupby but could not get the desired output. Please help.


Solution

  • Group by the values of Name converted to lowercase, then aggregate with first for Name and sum for Count:

    # Group on the lowercased names; keep the first spelling seen and sum the counts
    df = (df.groupby(df['Name'].str.lower(), as_index=False, sort=False)
            .agg({'Name':'first', 'Count':'sum'}))
    print(df)
        Name  Count
    0    Ram      4
    1  Arjun      7
    

    Detail (the grouping key, computed on the original df before the reassignment above):

    print(df['Name'].str.lower())
    0      ram
    1      ram
    2      ram
    3    arjun
    4    arjun
    Name: Name, dtype: object
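
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and uses named aggregation instead of the dict form. The variable names key and result are my own, chosen only for illustration, and the key is renamed so it cannot clash with the Name column.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Name": ["Ram", "ram", "raM", "Arjun", "arjun"],
    "Count": [1, 2, 1, 3, 4],
})

# Lowercased copy of Name, used only as the grouping key
# ("key" is an illustrative name, not part of the original answer)
key = df["Name"].str.lower().rename("key")

result = (
    df.groupby(key, sort=False)          # sort=False keeps first-seen group order
      .agg(Name=("Name", "first"),       # keep the first spelling in each group
           Count=("Count", "sum"))       # add up the counts in each group
      .reset_index(drop=True)            # discard the lowercase key
)
print(result)
```

Here result should hold Ram with a Count of 4 and Arjun with a Count of 7. The original df is left untouched, so the lowercase key can still be inspected afterwards.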