excel, pandas, data-analysis

How to get unique rows with some conditions using pandas or Excel


My table looks like this:


Now
Id  age  gender  IsDiabetes  IsObese  IsHeart
 1   23  Female  Yes         No       Yes
 1   23  Female  Yes         Yes      Yes
 2   60  Male    Yes         No       No
 3   70  Female  No          No       Yes
 3   70  Female  No          Yes      Yes

My desired outcome: if any row for an Id has Yes in a column, keep Yes instead of No, so that I end up with one row per Id and no useless duplicates.

Desired 
Id  age  gender  IsDiabetes  IsObese  IsHeart
 1   23  Female  Yes         Yes      Yes
 2   60  Male    Yes         No       No
 3   70  Female  No          Yes      Yes

I tried:


data1 = df.groupby('Id').agg(list)

But this is not the best solution for me: the results were not usable when I opened the Excel file.


Solution

  • You can aggregate with max, because "Yes" is lexicographically greater than "No", so the maximum of each group is "Yes" whenever at least one row has it:

    data1 = df.groupby(['Id','age','gender'], as_index=False).max()
    print(data1)
       Id  age  gender IsDiabetes IsObese IsHeart
    0   1   23  Female        Yes     Yes     Yes
    1   2   60    Male        Yes      No      No
    2   3   70  Female         No     Yes     Yes
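
For anyone who wants to run the approach end to end, here is a minimal self-contained sketch. It rebuilds the sample table from the question (assuming the column is named `IsObese`, without a space, as in the output above) and compares the string `max` with a more explicit variant that converts the flags to booleans and uses `any`. The names `key_cols`, `flag_cols`, `by_max` and `by_any` are illustrative only and are not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question. Column names are assumed
# to contain no spaces ("IsObese" rather than "Is Obese").
df = pd.DataFrame({
    "Id": [1, 1, 2, 3, 3],
    "age": [23, 23, 60, 70, 70],
    "gender": ["Female", "Female", "Male", "Female", "Female"],
    "IsDiabetes": ["Yes", "Yes", "Yes", "No", "No"],
    "IsObese": ["No", "Yes", "No", "No", "Yes"],
    "IsHeart": ["Yes", "Yes", "No", "Yes", "Yes"],
})

key_cols = ["Id", "age", "gender"]                # columns identifying a person
flag_cols = ["IsDiabetes", "IsObese", "IsHeart"]  # Yes/No columns to merge

# Variant 1: string maximum. Works because "Yes" > "No" when compared as text.
by_max = df.groupby(key_cols, as_index=False)[flag_cols].max()

# Variant 2: convert to booleans first, so the result does not depend on
# string ordering, then map back to Yes/No.
flags = df[flag_cols].eq("Yes")
by_any = (
    pd.concat([df[key_cols], flags], axis=1)
    .groupby(key_cols)[flag_cols]
    .any()
    .reset_index()
)
by_any[flag_cols] = by_any[flag_cols].apply(
    lambda col: col.map({True: "Yes", False: "No"})
)

print(by_max)
```

Both variants give one row per Id on this data. The string `max` only works as long as the columns hold exactly "Yes" and "No"; with other spellings (for example "yes" or "N") the boolean variant may be easier to adapt. Unlike `agg(list)`, each cell in the result holds a single value, so something like `by_max.to_excel("result.xlsx", index=False)` should produce a normal sheet, assuming an Excel writer engine such as openpyxl is installed.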