Search code examples
pythonpandastimedelta

Pandas Timedelta Multiple If Condition


I would like to create a new column called time_of_day for a dataframe such as below so that 06-12:59 is morning, 13-17:59 is afternoon, 18-22:59 is evening and rest of the time is night.

id TIME
25 00:01
25 02:01
25 06:55
18 22:03
18 23:33
18 00:33

What I expect

id TIME TIME_OF_DAY
25 00:01 Night
25 02:01 Night
25 06:55 Morning
18 22:03 Evening
18 23:33 Night
18 00:33 Night

I tried the numpy vectorization with choices and conditions, but it failed. Then I tried the following:

def conditions(s):
   if (pd.to_timedelta(df['TIME']) >= pd.to_timedelta('06:00:00')) & (pd.to_timedelta(df['TIME']) < pd.to_timedelta('13:00:00')):
       return "Morning" 
   elif (pd.to_timedelta(df['TIME']) >= pd.to_timedelta('13:00:00')) & (pd.to_timedelta(df['TIME']) < pd.to_timedelta('18:00:00')):
       return "Afternoon" 
   elif (pd.to_timedelta(df['TIME']) >= pd.to_timedelta('18:00:00')) & (pd.to_timedelta(df['TIME']) < pd.to_timedelta('23:00:00')):
       return "Evening" 
   elif (pd.to_timedelta(df['TIME']) >= pd.to_timedelta('23:00:00')) & (pd.to_timedelta(df['TIME']) < pd.to_timedelta('06:00:00')):
       return "Night" 
df['TIME_OF_DAY'] = df.apply(conditions, axis=1)

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I overcome this issue? PS: I added .all() for the enf od each if condition, but still got the same error.


Solution

  • Use pandas.cut, it will be more efficient than a function with apply:

    bins = ['0', '06:00:00', '13:00:00', '18:00:00', '23:00:00', '24:00:00']
    labels = ['Night', 'Morning', 'Afternoon', 'Evening', 'Night']
    
    df['TIME_OF_DAY'] = pd.cut(
      pd.to_timedelta(df['TIME']+':00'),
      bins=list(map(pd.Timedelta, bins)),
      labels=labels, right=False, ordered=False
    )
    

    output:

       id   TIME TIME_OF_DAY
    0  25  00:01       Night
    1  25  02:01       Night
    2  25  06:55     Morning
    3  18  22:03     Evening
    4  18  23:33       Night
    5  18  00:33       Night
    

    If you need a function:

    def to_day_period(s):
        bins = ['0', '06:00:00', '13:00:00', '18:00:00', '23:00:00', '24:00:00']
        labels = ['Night', 'Morning', 'Afternoon', 'Evening', 'Night']
        
        return pd.cut(
          pd.to_timedelta(s+':00'),
          bins=list(map(pd.Timedelta, bins)),
          labels=labels, right=False, ordered=False
        )
    
    df['TIME_OF_DAY'] = to_day_period(df['TIME'])