
Group a pandas DataFrame and keep the earliest date from one column and the latest date from another column in Python


I have the following data:

id   start_date   end_date
1    2023-01-01   2023-02-02
1    2023-02-05   2023-02-15
1    2023-02-16   2023-03-14

How can I group by id and keep the earliest date from start_date and the latest date from end_date? For example:

id   start_date   end_date
1    2023-01-01   2023-03-14

Solution

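  • min and max work directly on dates, whether the columns are datetime64 or ISO-formatted (YYYY-MM-DD) strings, which sort chronologically. Pass them to agg by name (recent pandas versions warn when the Python built-ins min and max are passed instead):

    df2 = df.groupby('id').agg({'start_date': 'min', 'end_date': 'max'}).reset_index()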
    

    to give:

       id  start_date    end_date
    0   1  2023-01-01  2023-03-14
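
For a self-contained version of the answer above, the following is a minimal sketch that rebuilds the sample data from the question and applies the same aggregation. It makes two assumptions that are not in the original: the dates arrive as strings and are parsed with pd.to_datetime first, and named aggregation is used in place of the dict form (the two are interchangeable here).

```python
import pandas as pd

# Sample data from the question; storing the dates as strings is an assumption.
df = pd.DataFrame({
    "id": [1, 1, 1],
    "start_date": ["2023-01-01", "2023-02-05", "2023-02-16"],
    "end_date": ["2023-02-02", "2023-02-15", "2023-03-14"],
})

# Parse to datetime64 so min/max compare real dates rather than text.
df["start_date"] = pd.to_datetime(df["start_date"])
df["end_date"] = pd.to_datetime(df["end_date"])

# Named aggregation: output column = (input column, aggregation name).
# as_index=False keeps id as a regular column, like reset_index() would.
df2 = df.groupby("id", as_index=False).agg(
    start_date=("start_date", "min"),
    end_date=("end_date", "max"),
)

print(df2)
```

Parsing first matters mainly when the strings are not in YYYY-MM-DD form (for example DD/MM/YYYY), since such strings would not sort in date order.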