
Group rows by a key without changing any other columns


Is it possible to group rows by a key without changing any column other than the key column, which would move into the index? If so, how can we do it?

import pandas as pd

df = pd.DataFrame({
    'id': ['A','A','A','B','B','C','C','C','C'],
    'data1': [11,35,46,11,26,25,39,50,55],
    'data2': [1,1,1,1,1,2,2,2,2],
})
df

I want a frame with ['A', 'B', 'C'] as the index, where every row of data1 and data2 sits under index A if id=A, index B if id=B, and index C if id=C.

Something like this:

   data1  data2
A   11      1
    35      1
    46      1
B   11      1
    26      1
C   25      2
    39      2
    50      2
    55      2

Solution

  • Why not set id as index? Like so:

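    import pandas as pd

    df = pd.DataFrame({
        'id': ['A','A','A','B','B','C','C','C','C'],
        'data1': [11,35,46,11,26,25,39,50,55],
        'data2': [1,1,1,1,1,2,2,2,2],
    })

    # move 'id' into the index; data1 and data2 are left unchanged
    df.set_index(['id'], inplace=True)
    # select all rows whose index label is 'A'
    df[df.index.isin(['A'])]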
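
Since the question asks about groupby specifically, here is a minimal sketch that puts the index-based selection next to a groupby that is never aggregated. It assumes the same sample frame as above; the names indexed, rows_a, group_a and groups are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'data1': [11, 35, 46, 11, 26, 25, 39, 50, 55],
    'data2': [1, 1, 1, 1, 1, 2, 2, 2, 2],
})

# Option 1: move 'id' into the index; no values are changed or aggregated.
indexed = df.set_index('id')
rows_a = indexed.loc[['A']]       # list selector keeps the result a DataFrame

# Option 2: groupby without an aggregation, pulling out one group at a time.
grouped = df.groupby('id')
group_a = grouped.get_group('A')

# Iterating yields (key, sub-frame) pairs with the original rows untouched.
groups = {key: frame for key, frame in grouped}
```

Either way the rows for a key come back as they were in the original frame. Which one fits probably depends on whether you want one frame with id as the index or separate sub-frames per key.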
    



    Alternatively, you could create a fake MultiIndex:

    df = pd.DataFrame({
                'id': ['A','A','A','B','B','C','C','C','C'],
                'data1': [11,35,46,11,26,25,39,50,55],
                'data2': [1,1,1,1,1,2,2,2,2],      
             })
    
    ### create empty column
    df['empty'] = ''
    
    ### create multi index
    df.set_index(['id','empty'], inplace=True)
    
    


    # set the index name to None if you don't want an index name shown
    df.index.set_names(None, level=0, inplace=True)
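
If you would rather avoid inplace=True, the rename can probably be written as a plain assignment. This is a small sketch on a trimmed-down copy of the sample frame (only three rows, to keep it short); names_before and names_after are illustrative variables, not part of the answer above.

```python
import pandas as pd

# Trimmed-down version of the sample frame, for illustration only.
df = pd.DataFrame({
    'id': ['A', 'A', 'B'],
    'data1': [11, 35, 11],
})
df['empty'] = ''
df = df.set_index(['id', 'empty'])

names_before = list(df.index.names)   # ['id', 'empty']

# set_names returns a new index unless inplace=True is passed,
# so assign the result back to df.index.
df.index = df.index.set_names(None, level=0)

names_after = list(df.index.names)    # [None, 'empty']
```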
    


    ### query like this
    df.loc[df.index.get_level_values(0) == 'A']
    


    ## or like this
    df.loc[df.index.get_level_values(0) == 'A'].droplevel(1)
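
The two queries above can also be written with DataFrame.xs. This is a sketch under the same assumptions (sample frame plus the empty placeholder level); by_mask, by_xs and flat are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'data1': [11, 35, 46, 11, 26, 25, 39, 50, 55],
    'data2': [1, 1, 1, 1, 1, 2, 2, 2, 2],
})
df['empty'] = ''
df = df.set_index(['id', 'empty'])

# Boolean mask on the first index level, as in the query above.
by_mask = df.loc[df.index.get_level_values(0) == 'A']

# xs selects on level 0; drop_level=False keeps both index levels.
by_xs = df.xs('A', level=0, drop_level=False)

# Dropping the placeholder level leaves plain 'A' labels in the index.
flat = by_xs.droplevel(1)
```

Both selections return the same three rows; xs is just a shorter way to say "everything under A".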
    
