Search code examples
pandassortinggroup-by

categorical sorting within date group in dataframe


I have a dataframe like this one and would like to make sure to sort it by date and have the same ordering for side for status emtries, I would like to see B entries on top before S for each date. But also keep the dates where there is only one entry.

tempDF = pd.DataFrame({ 'id': [12,12,12,12,45,45,45,51,51,51,51,51,51,76,76,76],
                'date': ['2015-05-01','2015-05-22','2015-05-14','2015-05-06','2015-05-03','2015-05-12','2015-05-02','2015-05-05','2015-05-01','2015-05-23','2015-05-17','2015-05-03','2015-05-05','2015-05-04','2015-05-22','2015-05-08'],
                'status': ['B','S','B','S','B','B','S','B','S','B','B','S','S','B','B','S']})
tempDF['date'] = pd.to_datetime(tempDF['date'])

I tried

tempDF.sort_values(['status']).groupby(tempDF['date']).head(2)

but this ends up the entries sorted before being grouped


Solution

  • If I understand correctly, first sort_values then grouby.head with sort=False:

    (tempDF.sort_values(by=['date', 'status'], ascending=True)
           .groupby('date', sort=False).head(2)
    )
    

    Output:

        id       date status
    0   12 2015-05-01      B
    8   51 2015-05-01      S
    6   45 2015-05-02      S
    4   45 2015-05-03      B
    11  51 2015-05-03      S
    13  76 2015-05-04      B
    7   51 2015-05-05      B
    12  51 2015-05-05      S
    3   12 2015-05-06      S
    15  76 2015-05-08      S
    5   45 2015-05-12      B
    2   12 2015-05-14      B
    10  51 2015-05-17      B
    14  76 2015-05-22      B
    1   12 2015-05-22      S
    9   51 2015-05-23      B
    

    Similarly, if you want all Bs before all Ss:

    tempDF.sort_values(by=['status', 'date'], ascending=True)
           .groupby('date', sort=False).head(2)
    )
    

    Output:

        id       date status
    0   12 2015-05-01      B
    4   45 2015-05-03      B
    13  76 2015-05-04      B
    7   51 2015-05-05      B
    5   45 2015-05-12      B
    2   12 2015-05-14      B
    10  51 2015-05-17      B
    14  76 2015-05-22      B
    9   51 2015-05-23      B
    8   51 2015-05-01      S
    6   45 2015-05-02      S
    11  51 2015-05-03      S
    12  51 2015-05-05      S
    3   12 2015-05-06      S
    15  76 2015-05-08      S
    1   12 2015-05-22      S