
Pandas: Prevent groupby-apply from sorting the results by group key


Say I have a dataframe,

import math
import pandas as pd

dict_ = {
    'Query': ['apple', 'banana', 'mango', 'bat', 'cat', 'rat', 'lion', 'potato', 'london', 'new jersey'],
    'Category': ['fruits', 'fruits', 'fruits', 'animal', 'animal', 'animal', 'animal', 'veggie', 'place', 'place'],
}
df = pd.DataFrame(dict_)

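I replicate the rows of each group so that every group has as many rows as the largest one. After doing so, the resulting dataframe comes out sorted by the group key (Category) instead of keeping the original group order.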

rep_val = df.groupby('Category').size().max()
df.groupby('Category').apply(lambda d: pd.concat([d] * math.ceil(rep_val / d.shape[0])).head(rep_val)).reset_index(drop=True)


         Query Category
0          bat   animal
1          cat   animal
2          rat   animal
3         lion   animal
4        apple   fruits
5       banana   fruits
6        mango   fruits
7        apple   fruits
8       london    place
9   new jersey    place
10      london    place
11  new jersey    place
12      potato   veggie
13      potato   veggie
14      potato   veggie
15      potato   veggie

However, I expected the groups to appear in the same order as they first appear in the original dataframe, so the categories should come out as fruits, animal, veggie, place, each followed by its corresponding values.

Expected output :

         Query Category
0        apple   fruits
1       banana   fruits
2        mango   fruits
3        apple   fruits
4          bat   animal
5          cat   animal
6          rat   animal
7         lion   animal
8       potato   veggie
9       potato   veggie
10      potato   veggie
11      potato   veggie
12      london    place
13  new jersey    place
14      london    place
15  new jersey    place

Solution

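  • Set sort=False in the groupby call. By default groupby sorts the group keys (sort=True), which is why the categories come out in alphabetical order; with sort=False the groups keep their order of first appearance.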

    CODE

    rep_val = df.groupby('Category', sort=False).size().max()
    df = df.groupby('Category', sort=False).apply(lambda d: pd.concat(([d] * math.ceil(rep_val / d.shape[0]))).head(rep_val)).reset_index(drop=True)
    

    OUTPUT

             Query Category
    0        apple   fruits
    1       banana   fruits
    2        mango   fruits
    3        apple   fruits
    4          bat   animal
    5          cat   animal
    6          rat   animal
    7         lion   animal
    8       potato   veggie
    9       potato   veggie
    10      potato   veggie
    11      potato   veggie
    12      london    place
    13  new jersey    place
    14      london    place
    15  new jersey    place
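
For reference, here is a minimal self-contained sketch of the same idea. It keeps sort=False but swaps the apply call for a plain loop over the groups, which is my own substitution rather than part of the answer above, since how apply treats the grouping column has changed across pandas versions. The names pieces, n_copies and result are illustrative only.

```python
import math

import pandas as pd

df = pd.DataFrame({
    'Query': ['apple', 'banana', 'mango', 'bat', 'cat', 'rat', 'lion',
              'potato', 'london', 'new jersey'],
    'Category': ['fruits', 'fruits', 'fruits', 'animal', 'animal', 'animal',
                 'animal', 'veggie', 'place', 'place'],
})

# Size of the largest group; every group is padded up to this many rows.
rep_val = df.groupby('Category', sort=False).size().max()

pieces = []
# sort=False makes the groups iterate in order of first appearance
# instead of in sorted key order.
for _, d in df.groupby('Category', sort=False):
    n_copies = math.ceil(rep_val / len(d))
    # Repeat the group's rows, then trim to exactly rep_val rows.
    pieces.append(pd.concat([d] * n_copies).head(rep_val))

result = pd.concat(pieces, ignore_index=True)
print(result)
```

Iterating over the groupby object always yields the full sub-frame, including the Category column, so no reset_index(drop=True) step is needed here; ignore_index=True in the final concat renumbers the rows.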