Search code examples
pythonpandasdataframecsvexport-to-excel

How to use groupby in a dataframe and output only unique group names but contains all rows?


I have data as follows:

country     state        area    people
India       Tamil Nadu   urban   194
India       Karnataka    urban   96
Sri Lanka   state1       urban   600
India       Chennai      urban   302
Sri Lanka   state2       urban   213
India       Bengaluru    rural   34
Sri Lanka   state3       rural   173
U.S.A       NYC          urban   300
Germany     Scehltr      rural   87
India       west Bengal  urban   902
Sri Lanka   State 9      rural   102

And I want to group this data as here:

country     state        area    people
India       Tamil Nadu   urban   194
            Karnataka    urban   96
            Bengaluru    rural   34
            Chennai      urban   302
Sri Lanka   state1       urban   600
            state2       urban   213
            state3       rural   173

And I want to insert this data into a csv file in the same exact format, such that when the csv file is opened, I would get the data as in this above output.

I will be using Python Pandas for this task.


Solution

  • Filter only duplicated values by country column and then sorting by same column, last repalce duplicated values to empty strings:

    df = df[df.duplicated(['country'], keep=False)].sort_values('country', ignore_index=True)
    df.loc[df.duplicated(subset=['country']), 'country'] = ''
    print (df)
         country        state   area  people
    0      India   Tamil Nadu  urban     194
    1               Karnataka  urban      96
    2                 Chennai  urban     302
    3               Bengaluru  rural      34
    4             west Bengal  urban     902
    5  Sri Lanka       state1  urban     600
    6                  state2  urban     213
    7                  state3  rural     173
    8                 State 9  rural     102
    

    If use MultiIndex in file are values repeated:

    df = df.set_index(['country','state'])
    print (df)
                            area  people
    country   state                     
    India     Tamil Nadu   urban     194
              Karnataka    urban      96
              Chennai      urban     302
              Bengaluru    rural      34
              west Bengal  urban     902
    Sri Lanka state1       urban     600
              state2       urban     213
              state3       rural     173
              State 9      rural     102