Search code examples
pythonpandasdataframedatetimedate-range

Pandas Aggregation - how to find nearest event based on event dates


I have a long list of events that are in a pandas dataframe with the event start and end dates for each event. How can I find the "nearest" event in each city & venue combination using pandas.groupby()?

Nearest event could be an event in the past, if no new upcoming events. i.e latest event in the dataframe in this case, which happened to be in the past. If there are multiple upcoming events, the one closest in the future will be considered as nearest event.

I tried groupby.agg("max") as below but that will give the event that is farthest in the future always:

dfp.groupby(['CITY', 'VENUE'], as_index=False).agg({"EVENT_START" : "max", "EVENT_END": "max"})

Looking for a way to get the event that is nearest in the future in time (and if no future events, closest in the past).

Sample data:

EVENT_START,EVENT_END,Event Description,City,Venue
2/5/2016,3/12/2016,event 1,Chicago,Art Institute of Chicago
11/2/2014,12/2/2014,event 2,Los Angelos,Party Haus
1/25/2018,1/31/2018,event 3,Long Beach,Precious Lamb
8/26/2018,8/31/2018,event 4,West Columbia,New Brookland Tavern
11/20/2017,12/17/2017,event 5,Paris,Orsay Museum
6/26/2018,7/9/2018,event 6,Lahaina,Bamboo Fresh
4/3/2010,5/2/2010,event 7,Mitchell,The Corn Palace
9/21/2015,10/18/2015,event 8,San Diego,San Diego Zoo
1/4/2014,1/15/2014,event 9,Portland,Doug Fir Lounge
9/21/2019,9/26/2019,event 10,St. Louis,Krispy Kreme
3/15/2015,2/13/2018,event 11,Corvallis,The Beanery
9/23/2005,10/2/2005,event 12,San Jose,Winchester Mystery House
12/11/2019,12/14/2019,event 13,Chicago,Art Institute of Chicago
6/1/2013,6/26/2013,event 14,Los Angelos,Party Haus
7/10/2020,9/4/2020,event 15,Long Beach,Precious Lamb
10/18/2020,11/26/2020,event 16,West Columbia,New Brookland Tavern
5/14/2004,5/16/2004,event 17,Paris,Orsay Museum
11/16/2020,11/20/2020,event 18,Lahaina,Bamboo Fresh
7/19/2020,10/22/2020,event 19,Mitchell,The Corn Palace
11/1/2017,11/30/2017,event 20,San Diego,San Diego Zoo
7/31/2015,8/1/2015,event 21,Portland,Doug Fir Lounge
10/12/2012,10/20/2012,event 22,St. Louis,Krispy Kreme
2/28/2003,3/13/2003,event 23,Corvallis,The Beanery
9/16/2019,9/20/2019,event 24,San Jose,Winchester Mystery House
3/1/2022,4/1/2022,event 25,Chicago,Art Institute of Chicago
2/19/2009,2/25/2009,event 26,Los Angelos,Party Haus
4/16/2015,5/8/2015,event 27,Long Beach,Precious Lamb
9/7/2016,9/11/2016,event 28,West Columbia,New Brookland Tavern
8/4/2001,8/26/2001,event 29,Paris,Orsay Museum
4/27/2017,6/11/2017,event 30,Lahaina,Bamboo Fresh
5/21/2011,6/19/2011,event 31,Mitchell,The Corn Palace
6/3/2020,8/10/2020,event 32,San Diego,San Diego Zoo
10/29/2012,11/15/2012,event 33,Portland,Doug Fir Lounge
9/1/2027,10/15/2027,event 34,St. Louis,Krispy Kreme
6/23/2017,6/25/2017,event 35,Corvallis,The Beanery
4/25/2007,5/26/2007,event 36,San Jose,Winchester Mystery House
5/30/2003,7/1/2003,event 37,Chicago,Art Institute of Chicago
3/14/2008,4/12/2008,event 38,Los Angelos,Party Haus
5/29/2017,7/27/2017,event 39,Long Beach,Precious Lamb
1/31/2015,3/7/2015,event 40,West Columbia,New Brookland Tavern
4/1/2017,4/21/2017,event 41,Paris,Orsay Museum
12/29/2003,1/31/2004,event 42,Lahaina,Bamboo Fresh
7/3/2021,7/17/2021,event 43,Mitchell,The Corn Palace
9/1/2004,4/30/2005,event 44,San Diego,San Diego Zoo
10/14/2006,10/27/2006,event 45,Portland,Doug Fir Lounge
7/18/2017,7/19/2017,event 46,St. Louis,Krispy Kreme
6/1/2006,6/1/2006,event 47,Corvallis,The Beanery
10/1/2012,11/4/2012,event 48,San Jose,Winchester Mystery House
9/5/2011,9/19/2011,event 49,Chicago,Art Institute of Chicago
5/28/2020,6/2/2020,event 50,Los Angelos,Party Haus
3/1/2023,4/1/2023,event 51,Chicago,Art Institute of Chicago

The result should be:

3/1/2022    4/1/2022    event 25    Chicago Art Institute of Chicago
5/28/2020   6/2/2020    event 50    Los Angelos Party Haus
7/10/2020   9/4/2020    event 15    Long Beach  Precious Lamb
10/18/2020  11/26/2020  event 16    West Columbia   New Brookland Tavern
11/20/2017  12/17/2017  event 5 Paris   Orsay Museum
11/16/2020  11/20/2020  event 18    Lahaina Bamboo Fresh
7/3/2021    7/17/2021   event 43    Mitchell    The Corn Palace
6/3/2020    8/10/2020   event 32    San Diego   San Diego Zoo
7/31/2015   8/1/2015    event 21    Portland    Doug Fir Lounge
9/1/2027    10/15/2027  event 34    St. Louis   Krispy Kreme
6/23/2017   6/25/2017   event 35    Corvallis   The Beanery
9/16/2019   9/20/2019   event 24    San Jose    Winchester Mystery House

Solution

  • # split past events and future events
    cond = df['EVENT_START'] > datetime.now()
    df_furture = df[cond]
    df_past = df[~cond]
    
    # keep the nearest furture
    dfn_furture = df_furture.sort_values(['City', 'Venue', 'EVENT_START'])\
                  .drop_duplicates(['City', 'Venue'], keep='first')
    
    # merge one closest furture event for every city and the past events
    dfn = pd.concat([dfn_furture, df_past])
    df_result = dfn.sort_values(['City', 'Venue', 'EVENT_START'])\
                .drop_duplicates(['City', 'Venue'], keep='last').sort_index()
    

    result:

       EVENT_START  EVENT_END Event Description           City  \
    4   2017-11-20 2017-12-17           event 5          Paris   
    14  2020-07-10 2020-09-04          event 15     Long Beach   
    15  2020-10-18 2020-11-26          event 16  West Columbia   
    17  2020-11-16 2020-11-20          event 18        Lahaina   
    20  2015-07-31 2015-08-01          event 21       Portland   
    23  2019-09-16 2019-09-20          event 24       San Jose   
    24  2022-03-01 2022-04-01          event 25        Chicago   
    31  2020-06-03 2020-08-10          event 32      San Diego   
    33  2027-09-01 2027-10-15          event 34      St. Louis   
    34  2017-06-23 2017-06-25          event 35      Corvallis   
    42  2021-07-03 2021-07-17          event 43       Mitchell   
    49  2020-05-28 2020-06-02          event 50    Los Angelos   
    
                           Venue  
    4               Orsay Museum  
    14             Precious Lamb  
    15      New Brookland Tavern  
    17              Bamboo Fresh  
    20           Doug Fir Lounge  
    23  Winchester Mystery House  
    24  Art Institute of Chicago  
    31             San Diego Zoo  
    33              Krispy Kreme  
    34               The Beanery  
    42           The Corn Palace  
    49                Party Haus