Python: Extract column name for earliest and latest columns


I have a dataset with multiple date columns, as follows:

Input Dataframe

> df
Date1       Date2       Date3
2008-08-11  NaT         2005-05-14
2010-06-18  2012-09-04  2003-01-16
NaT         2006-11-05  2009-03-31

Desired Output

I want to add a column minDatecol that holds, for each row, the name of the column with the earliest date:

> df
Date1       Date2       Date3       minDatecol
2008-08-11  NaT         2005-05-14  Date3
2010-06-18  2012-09-04  2003-01-16  Date3
NaT         2006-11-05  2009-03-31  Date2

I tried df[cols].idxmin(axis=1), but I get the following error:

ValueError: attempt to get argmin of an empty sequence

Any help you can provide?


Solution

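  • Fill the missing values (NaT) with pd.Timestamp.max before calling idxmin, so that a missing date cannot be picked over a real one:

    cols = ['Date1','Date2','Date3']
    # NaT becomes the largest representable timestamp, so any real date is smaller
    df['minDatecol'] = df[cols].fillna(pd.Timestamp.max).idxmin(axis=1)
    print(df)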
           Date1      Date2      Date3 minDatecol
    0 2008-08-11        NaT 2005-05-14      Date3
    1 2010-06-18 2012-09-04 2003-01-16      Date3
    2        NaT 2006-11-05 2009-03-31      Date2
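
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and, since the title also asks about the latest column, applies the same idea in mirror image with pd.Timestamp.min and idxmax. The maxDatecol column name is my own choice and not from the question, and the explicit cast to datetime64[ns] is an assumption made so that the sentinel timestamps fit the column dtype.

```python
import pandas as pd

# Sample data from the question; None becomes NaT after conversion.
df = pd.DataFrame({
    "Date1": ["2008-08-11", "2010-06-18", None],
    "Date2": [None, "2012-09-04", "2006-11-05"],
    "Date3": ["2005-05-14", "2003-01-16", "2009-03-31"],
})
cols = ["Date1", "Date2", "Date3"]

# Parse every column as a date and pin nanosecond resolution (assumption:
# keeps pd.Timestamp.max / pd.Timestamp.min representable in the columns).
df[cols] = df[cols].apply(pd.to_datetime).astype("datetime64[ns]")

# Earliest date: replace NaT with the largest timestamp so it never wins.
df["minDatecol"] = df[cols].fillna(pd.Timestamp.max).idxmin(axis=1)

# Latest date (hypothetical column name): replace NaT with the smallest timestamp.
df["maxDatecol"] = df[cols].fillna(pd.Timestamp.min).idxmax(axis=1)

print(df)
```

One caveat: if a row contains only NaT, every value in it becomes the same sentinel, so idxmin and idxmax would return the first column name rather than a missing value. Mask such rows separately if that can happen in your data.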