Search code examples
pythonpandasdatetimerelativedelta

How can I find the elapsed business hours between two dates using pandas' CustomBusinessHour objects?


If I want to find the number of hours between two datetime objects, I can do something like this:

from datetime import datetime

today = datetime.today()
day_after_tomorrow = datetime(2022, 9, 24)

diff = (day_after_tomorrow - today).total_seconds() / 3600
print(diff)

which returns: 37.58784580333333 hours.

But this is the number of real hours between two dates. I want to know the number of specific business hours between two dates.

I can define two CustomBusinessHour objects with pandas to specify those business hours (which are 8AM to 4:30PM M-F, and 8AM to 12PM on Saturday, excluding US Federal holidays):

from pandas.tseries.offsets import CustomBusinessHour
from pandas.tseries.holiday import USFederalHolidayCalendar

business_hours_mtf = CustomBusinessHour(calendar=USFederalHolidayCalendar(), start='08:00', end='16:30')
business_hours_sat = CustomBusinessHour(calendar=USFederalHolidayCalendar(), start='08:00', end='12:00')

My understanding is that CustomBusinessHour is a type of pandas DateOffset object, so it should behave just like a relativedelta object. So I should be able to use it in the datetime arithmetic somehow, to get the number I want.

And that's as far as I was able to get.

What I think I'm struggling to understand is how relativedeltas work, and how to actually use them in datetime arithmetic.

Is this the right approach? If so, how can I use these CustomBusinessHour objects to get an accurate amount of elapsed business hours between the two dates?


Solution

  • I figured out a solution. It feels ugly and hacky, but it seems to work. Hopefully someone else has a simpler or more elegant solution.

    Edit: I cleaned up the documentation a little bit to make it easier to read. Also added a missing kwarg in business_hours_sat. Figuring this out was a headache, so if anyone else has to deal with this problem, hopefully this solution helps.

    from datetime import datetime, timedelta
    from pandas.tseries.offsets import CustomBusinessHour
    from pandas.tseries.holiday import USFederalHolidayCalendar
    
    business_hours_mtf = CustomBusinessHour(calendar=USFederalHolidayCalendar(), start='08:00', end='16:30')
    business_hours_sat = CustomBusinessHour(calendar=USFederalHolidayCalendar(), weekmask='Sat', start='08:00', end='12:00')
    
    def get_business_hours_range(earlier_date: datetime, later_date: datetime) -> float:
        """Return the number of business hours between `earlier_date` and `later_date` as a float with two decimal places.
        
        Algorithm:
    
        1. Increment `earlier_date` by 1 "business hour" until it's further in the future than `later_date`.
        2. Also increment an `elapsed_business_hours` variable by 1.
        3. Once `earlier_date` is larger (further in the future) than `later_date`...
            a. Roll back `earlier_date` by one business hour.
            b. Get the close of business hour for `earlier_date` ([3a]).
            c. Get the number of minutes between [3b] and [3a] (`minutes_remaining`).
            d. Create a timedelta with `elapsed_business_hours` and `minutes_remaining`
            e. Represent this timedelta as a float with two decimal places.
            f. Return this float.
        """
    
        # Count how many "business hours" have elapsed between the `earlier_date` and `later_date`.
        elapsed_business_hours = 0.0
        current_day_of_week = 0
        while earlier_date < later_date:
            day_of_week = earlier_date.isoweekday()
            
            # 6 = Saturday
            if day_of_week == 6:
                # Increment `earlier_date` by one "business hour", as specified by the `business_hours_sat` CBH object.
                earlier_date += business_hours_sat
                # Increment the counter of how many "business hours" have elapsed between these two dates.
                elapsed_business_hours += 1
                # Save the current day of the week in `earlier_date`, in case this is the last iteration of this while loop.
                current_day_of_week = day_of_week
    
            # 1 = Monday, 2 = Tuesday, ...
            elif day_of_week in (1, 2, 3, 4, 5):
                # Increment `earlier_date` by one "business hour", as specified by the `business_hours_mtf` CBH object.
                earlier_date += business_hours_mtf
                # Increment the counter of how many "business hours" have elapsed between these two dates.
                elapsed_business_hours += 1
                # Save the current day of the week in `earlier_date`, in case this is the last iteration of this while loop.
                current_day_of_week = day_of_week
        
        # Once we've incremented `earlier_date` to a date further in the future than `later_date`, we know that we've counted
        # all the full (60min) "business hours" between `earlier_date` and `later_date`. (We can only increment by one hour when using
        # CBH, so when we make this final increment, we may be skipping over a few minutes in that last day.)
        #
        # So now we roll `earlier_date` back by 1 business hour, to the last full business hour before `later_date`. Then we get the
        # close of business hour for that day, and subtract `earlier_date` from it. This will give us whatever minutes may be remaining
        # in that day, that weren't accounted for when tallying the number of "business hours".
        #
        # But before we do these things, we need to check what day of the week the last business hour is, so we know which closing time
        # to use.
        if current_day_of_week == 6:
            ed_rolled_back = earlier_date - business_hours_sat
            ed_closing_time = datetime.combine(ed_rolled_back, business_hours_sat.end[0])
        elif current_day_of_week in (1, 2, 3, 4, 5):
            ed_rolled_back = earlier_date - business_hours_mtf
            ed_closing_time = datetime.combine(ed_rolled_back, business_hours_mtf.end[0])
        
        minutes_remaining = (ed_closing_time - ed_rolled_back).total_seconds() / 60
    
        if 0 < minutes_remaining < 60:
            delta = timedelta(hours=elapsed_business_hours, minutes=minutes_remaining)
        else:
            delta = timedelta(hours=elapsed_business_hours)
        
        delta_hours = round(float(delta.total_seconds() / 3600), 2)
    
        return delta_hours